
Apache Beam

First Apache Beam program: a word count example in Java.

Word count

BasicWordCountWithExplaination.java file
package com.test.training;

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;

public class BasicWordCountWithExplaination {

public static void main(String[] args) {
    // Create a PipelineOptions object. It lets us set execution options for the
    // pipeline, such as the runner to use. This example runs with the
    // DirectRunner by default, based on the dependencies on the class path.
    PipelineOptions options = PipelineOptionsFactory.create();

    // Create the Pipeline object with the options defined above.
    Pipeline p = Pipeline.create(options);
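    /*
     * Setup for the input file:
     * 1. Open the Google Cloud Platform console and select Storage.
     * 2. Create a bucket.
     * 3. Upload a text file containing some words.
     * 4. Copy the gs:// location of the uploaded file.
     * 5. Paste it into the from(...) call below.
     * 6. Save and run.
     */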
    // Read the input file; each element of the PCollection is one line of text.
    PCollection<String> line = p.apply(TextIO.read().from("gs://testingbucketv/text.txt"));

    // For each line, emit the number of space-separated words and the line itself.
    PCollection<String> count = line.apply(ParDo.of(new DoFn<String, String>() {
        @ProcessElement
        public void processElement(ProcessContext c) {
            String[] arr = c.element().split(" ");
            System.out.println("Number of words: " + arr.length);
            c.output("Length of words is=" + arr.length);
            c.output("\n\nWords are=\n" + c.element());
        }
    }));
    // Write the results to a file in the bucket. withoutSharding() produces a
    // single output file instead of several shards.
    count.apply(TextIO.write().to("gs://testingbucketv/output.txt").withoutSharding());

    // Run the pipeline and wait for it to finish.
    p.run().waitUntilFinish();
   
}

}
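
The per-line logic inside the DoFn can be tried without Beam or a storage bucket. The sketch below is only an illustration: the class LineStatsSketch and its describe method are hypothetical names, not part of the pipeline above. It returns the same two strings that the DoFn passes to c.output for one input line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; it mirrors the body of the DoFn above.
class LineStatsSketch {

    // Returns the two strings the DoFn emits for one input line, in emit order.
    static List<String> describe(String line) {
        String[] words = line.split(" ");
        List<String> out = new ArrayList<>();
        out.add("Length of words is=" + words.length);
        out.add("\n\nWords are=\n" + line);
        return out;
    }

    public static void main(String[] args) {
        // Same sample line as the text.txt file in this post.
        for (String s : describe("hello dear Happy diwali... thanks")) {
            System.out.println(s);
        }
    }
}
```

Running main prints the count line followed by the original line. In the real pipeline, each returned string becomes a separate element of the output PCollection.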


/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
text.txt file
hello dear Happy diwali... thanks
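
The sample line uses exactly one space between words, so split(" ") gives the expected count. If a line contains repeated spaces, split(" ") also returns empty strings and the count grows. The sketch below compares the two approaches; WordSplitSketch and both method names are hypothetical, and splitting on "\\s+" after trim() is an assumed alternative, not what the pipeline above does.

```java
// Hypothetical comparison for illustration; not part of the pipeline above.
class WordSplitSketch {

    // Same splitting as the DoFn in this post: one token per single space.
    static int countBySingleSpace(String line) {
        return line.split(" ").length;
    }

    // Assumed alternative: trim the line, then split on runs of whitespace.
    static int countByWhitespace(String line) {
        String trimmed = line.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        String sample = "hello dear Happy diwali... thanks";
        String spaced = "hello  dear   Happy"; // two and three spaces between words

        System.out.println(countBySingleSpace(sample) + " vs " + countByWhitespace(sample));
        System.out.println(countBySingleSpace(spaced) + " vs " + countByWhitespace(spaced));
    }
}
```

For the sample line both methods agree. For the line with repeated spaces, the single-space split counts the empty strings between consecutive spaces as words.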


/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
output.txt file
Length of words is=5


Words are=
hello dear Happy diwali... thanks



















