A script that collects tweets matching a specific query. The script reconnects every [duration] minutes and takes a document from a collection of queries as input. The developer account that is used depends on the language of the query.
directory: The directory you want to use, given as an absolute path. If none is specified this defaults to ~/logs/. The directory will be created if it does not exist.
filename: The filename you want to use. If none is specified this defaults to collectlog.txt.
duration: Duration of the data collection before reconnecting expressed in minutes, defaults to 60.
db: Database used for storage on the MongoDB server, defaults to "SOCMINT".
tcol: Collection in the database where tweets are stored, defaults to "streamtest".
qcol: Collection in the database that holds the stream API queries, defaults to "twitter_queries".
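
The parameters above could be exposed on the command line roughly as follows. This is only a minimal sketch: the flag names (--directory, --filename, ...) and the helper names build_parser and prepare_log_path are assumptions made for illustration and are not taken from the script itself. It also assumes that directory and filename together name the log file.

```python
import argparse
import os


def build_parser():
    """Build a parser mirroring the documented parameters and defaults.

    The flag names are hypothetical; the real script may name them differently.
    """
    parser = argparse.ArgumentParser(description="Collect tweets for a stored query.")
    parser.add_argument("--directory", default=os.path.expanduser("~/logs/"),
                        help="absolute path of the directory to use")
    parser.add_argument("--filename", default="collectlog.txt",
                        help="name of the file inside the directory")
    parser.add_argument("--duration", type=int, default=60,
                        help="minutes of collection before reconnecting")
    parser.add_argument("--db", default="SOCMINT",
                        help="database on the MongoDB server")
    parser.add_argument("--tcol", default="streamtest",
                        help="collection for tweet storage")
    parser.add_argument("--qcol", default="twitter_queries",
                        help="collection holding the stream API queries")
    return parser


def prepare_log_path(directory, filename):
    """Return the full file path, creating the directory if it is missing.

    Assumption: directory and filename are combined into one log file path.
    """
    if not os.path.isabs(directory):
        raise ValueError("directory must be an absolute path")
    os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, filename)


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(prepare_log_path(args.directory, args.filename))
```

Running the sketch without arguments uses the documented defaults; a relative directory is rejected before anything is created.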
A script that updates the query based on a time window. The script takes the tweets that fall within a time window and extracts trending hashtags from them. Only tweets in the same language as the query are used; to speed up this search, an index was created on the 'created_at' and 'lang' fields of the data.
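
The hashtag extraction described above could look roughly like this. It is a sketch under several assumptions, not the script's actual code: 'created_at' is assumed to be stored as a datetime, hashtags are assumed to sit under entities.hashtags[].text as in the Twitter API tweet object, "trending" is taken to mean "used in the most tweets within the window", and the names window_filter and trending_hashtags are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def window_filter(lang, end, minutes):
    """Build a MongoDB-style filter for tweets in `lang` from the last `minutes`.

    With pymongo this dict could be passed to collection.find(); a compound
    index such as collection.create_index([("lang", 1), ("created_at", 1)])
    would support it. Both field names come from the description above.
    """
    start = end - timedelta(minutes=minutes)
    return {"lang": lang, "created_at": {"$gte": start, "$lt": end}}


def trending_hashtags(tweets, top_n=5):
    """Return the `top_n` hashtags ranked by the number of tweets using them.

    Hashtags are lowercased and counted once per tweet; ties are broken
    alphabetically so the result is deterministic.
    """
    counts = Counter()
    for tweet in tweets:
        hashtags = tweet.get("entities", {}).get("hashtags", [])
        counts.update({tag["text"].lower() for tag in hashtags})
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:top_n]]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(window_filter("nl", now, 60))
    sample = [
        {"entities": {"hashtags": [{"text": "Python"}, {"text": "data"}]}},
        {"entities": {"hashtags": [{"text": "python"}]}},
    ]
    print(trending_hashtags(sample, top_n=2))
```

Counting each hashtag once per tweet keeps a single tweet that repeats a tag from inflating its rank; the real script may weigh hashtags differently.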