Field Options
The package performs several parsing steps to facilitate the analysis of archived tweets and types of tweets. The fields below are available and can be passed to the Parse and Export. In addition, the command line tool returns all of these fields.
- archived_urlkey: (str) A canonical transformation of the URL you supplied, for example, org,eserver,tc)/. Such keys are useful for indexing.
- archived_timestamp: (str) A 14-digit date-time representation in the YYYYMMDDhhmmss format.
- parsed_archived_timestamp: (str) The archived_timestamp in human-readable format.
- archived_tweet_url: (str) The archived URL.
- parsed_archived_tweet_url: (str) The archived URL after parsing. This URL is not guaranteed to be archived; it is provided as a convenience, because the originally archived URL does not always exist due to changes in the URLs and web services of the social network Twitter. Check the Utils.
- original_tweet_url: (str) The original tweet URL.
- parsed_tweet_url: (str) The original tweet URL after parsing. Old URLs were archived in a nested manner; the parsing applied here unnests these URLs when necessary. Check the Utils.
- available_tweet_text: (str) The tweet text extracted from the URL that is still available on the Twitter account.
- available_tweet_is_RT: (bool) Whether the tweet from the available_tweet_text field is a retweet.
- available_tweet_info: (str) Name and date of the tweet from the available_tweet_text field.
- archived_mimetype: (str) The mimetype of the archived content, which can be one of these: text/html, warc/revisit, application/json, unk.
- archived_statuscode: (str) The HTTP status code of the snapshot. If the mimetype is warc/revisit, the value returned for the statuscode key can be blank, but the actual value is the same as that of any other entry that has the same digest as this entry. If the mimetype is application/json, the value is usually empty or -.
- archived_digest: (str) The SHA1 hash digest of the content, excluding the headers. It’s usually a base-32-encoded string.
- archived_length: (int) The compressed byte size of the corresponding WARC record, which includes WARC headers, HTTP headers, and content payload.
- resumption_key: (str) Allows for a simple way to scroll through the results. It is the key used to continue the query from the end of the previous query.
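
To make two of the field formats above concrete, here is a minimal sketch using only the Python standard library. It is not the package's own code: the helper names parse_archived_timestamp and sha1_base32_digest are hypothetical, the human-readable output format is an assumption, and the digest helper only illustrates what a base-32-encoded SHA1 string looks like for a given payload.

```python
import base64
import hashlib
from datetime import datetime


def parse_archived_timestamp(archived_timestamp: str) -> str:
    """Convert a 14-digit YYYYMMDDhhmmss string into a readable date-time.

    Hypothetical helper; the output format chosen here is an assumption.
    """
    parsed = datetime.strptime(archived_timestamp, "%Y%m%d%H%M%S")
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


def sha1_base32_digest(payload: bytes) -> str:
    """Return the base-32-encoded SHA1 digest of a payload (headers excluded).

    Hypothetical helper; a SHA1 digest is 20 bytes, so the result is 32 characters.
    """
    return base64.b32encode(hashlib.sha1(payload).digest()).decode("ascii")


if __name__ == "__main__":
    print(parse_archived_timestamp("20230113223008"))  # 2023-01-13 22:30:08
    print(len(sha1_base32_digest(b"example payload")))  # 32
```

Two entries whose payloads produce the same digest string carry identical content, which is why a blank archived_statuscode on a warc/revisit entry can be resolved from another entry with the same archived_digest.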