hadoop - Appending a sequence file in HDFS
I have live streaming tweets that I need to store in HDFS. I can access the live tweets and am able to extract information from them. The requirement is that I need to append the tweets to a single sequence file in HDFS. I have thought of two ways to resolve this. Either I can store each tweet as a small file in HDFS and periodically bundle them into a single sequence file, or, at run time, I can read the sequence file and append the new contents to it.
Please let me know which approach I should go with. Kindly suggest a better solution if there is one for handling this type of use case.
I recommend using Flume. You can see how tweets are streamed into HDFS in this example: https://github.com/cloudera/cdh-twitter-example
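As a rough sketch of what the Flume side might look like, here is an agent configuration fragment with an HDFS sink that writes sequence files. The agent and component names (TwitterAgent, Twitter, MemChannel, HDFS), the output path, and all sizes and intervals are assumptions made for illustration. The Twitter source itself is left unconfigured here and would need its type and credentials set as in the linked example. Note that this does not produce one ever-growing file: the sink rolls to a new sequence file periodically (hourly in this sketch), which is closer to the "bundle periodically" approach from the question.

```properties
# Hypothetical agent and component names
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS

# Source: set its type and credentials as in the linked example (omitted here)
TwitterAgent.sources.Twitter.channels = MemChannel

# In-memory channel buffering events between source and sink (sizes are assumptions)
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 1000

# HDFS sink writing sequence files; the path is an assumed location
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = /user/flume/tweets
TwitterAgent.sinks.HDFS.hdfs.fileType = SequenceFile
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000

# Roll to a new file every hour; 0 disables size-based and count-based rolling
TwitterAgent.sinks.HDFS.hdfs.rollInterval = 3600
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 0
```

A longer roll interval gives fewer, larger files, at the cost of tweets staying in an open, not yet finalized file for longer.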