
Apache Beam

A first Apache Beam program: read a text file and count the words on each line.

Word count

BasicWordCountWithExplaination.java:
package com.test.training;

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;

public class BasicWordCountWithExplaination {

    public static void main(String[] args) {
        // Create a PipelineOptions object. It lets us set execution options for the
        // pipeline, such as the runner to use. This example runs with the DirectRunner
        // by default, based on the class path configured in its dependencies.
        PipelineOptions options = PipelineOptionsFactory.create();

        // Create the Pipeline object with the options defined above.
        Pipeline p = Pipeline.create(options);

        /*
         * Before running:
         * 1. Go to Google Cloud Platform and select Storage.
         * 2. Create a bucket.
         * 3. Upload a text file containing some words.
         * 4. Copy the bucket location and paste it into the paths below.
         * 5. Save and run.
         */
        PCollection<String> lines = p.apply(TextIO.read().from("gs://testingbucketv/text.txt"));

        // For each input line, emit two strings: the number of words and the line itself.
        PCollection<String> counts = lines.apply(ParDo.of(new DoFn<String, String>() {
            @ProcessElement
            public void processElement(ProcessContext c) {
                String[] words = c.element().split(" ");
                System.out.println("length of words" + words.length);
                c.output("Length of words is=" + words.length);
                c.output("\n\nWords are=\n" + c.element());
            }
        }));

        // Write the results to a file; withoutSharding() produces a single output file.
        counts.apply(TextIO.write().to("gs://testingbucketv/output.txt").withoutSharding());

        // Run the pipeline and block until it finishes.
        p.run().waitUntilFinish();
    }

}


/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
Input file (text.txt):
hello dear Happy diwali... thanks


/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
Output file (output.txt):
Length of words is=5


Words are=
hello dear Happy diwali... thanks
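
The per-line logic of the DoFn can be tried without Beam or a Cloud Storage bucket. The sketch below is plain Java and only mirrors what the DoFn does to one line; the class name LineSummary and the method summarize are hypothetical names chosen for illustration and are not part of the original program or of the Beam SDK.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Beam or the original post.
class LineSummary {

    // Mirrors the DoFn body: split the line on single spaces and build
    // the same two output strings the pipeline emits for that line.
    static List<String> summarize(String line) {
        String[] words = line.split(" ");
        return Arrays.asList(
                "Length of words is=" + words.length,
                "\n\nWords are=\n" + line);
    }

    public static void main(String[] args) {
        // Same content as text.txt above.
        for (String out : summarize("hello dear Happy diwali... thanks")) {
            System.out.println(out);
        }
    }
}
```

Because the split is on a single space, punctuation stays attached to its word ("diwali..." counts as one word), and two consecutive spaces produce an empty token that is also counted. Splitting on "\\s+" instead would be one way to avoid that, if the input may contain repeated spaces.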



















