PHP Class Phirehose

Note: This is beta software. Please read the following carefully before using:
- http://code.google.com/p/phirehose/wiki/Introduction
- http://dev.twitter.com/pages/streaming_api
Author: Fenn Bailey ([email protected])
Project: fennb/phirehose

Protected Properties

Property Type Description
$URL_BASE
$avgElapsed integer Reset to zero after each call to statusUpdate(). The highest value it should ever reach is $this->avgPeriod.
$avgPeriod
$buff
$conn
$connectFailuresMax Config type vars - override in subclass if desired
$connectTimeout
$count
$enqueueSpent float Total number of seconds (fractional) spent in the enqueueStatus() calls (i.e. the customized function that handles each received tweet).
$enqueueTimeMS float Simply: enqueueSpent divided by statusCount. Note: by default, calculated fresh for the past 60 seconds, every 60 seconds.
$fdrPool
$filterChanged State vars
$filterCheckCount integer Number of checkFilterPredicates() calls since the last reset. By default that method is called every 5 seconds, so if statusUpdate() runs every 60 seconds and then resets the counter, this will usually be 12.
$filterCheckMin
$filterCheckSpent float Total number of seconds (fractional) spent in the checkFilterPredicates() calls
$filterCheckTimeMS float Like $enqueueTimeMS but for the checkFilterPredicates() function.
$filterUpdMin
$followIds @see http://dev.twitter.com/pages/streaming_api_methods#count
$format
$hostPort
$httpBackoff
$httpBackoffMax
$idlePeriod integer Number of seconds since the last tweet arrived (or the keep-alive newline)
$idleReconnectTimeout
$lastErrorMsg
$lastErrorNo
$locationBoxes
$maxIdlePeriod integer The maximum value $this->idlePeriod has reached.
$method
$password
$readTimeout
$reconnect
$secureHostPort
$statusCount integer Note: by default this is the sum for last 60 seconds, and is therefore reset every 60 seconds. To change this behaviour write a custom statusUpdate() function.
$statusRate integer The number of tweets received per second in the previous minute, calculated fresh just before each call to statusUpdate(). I.e. if fewer than 30 tweets arrived in the last minute then this will be zero; if 30 to 90 then it will be 1; if 90 to 150 then 2; etc.
$status_length_base
$tcpBackoff
$tcpBackoffMax
$trackWords
$userAgent
$username Member Attribs

Public Methods

Method Description
__construct ( string $username, string $password, string $method = Phirehose::METHOD_SAMPLE, string $format = self::FORMAT_JSON, $lang = FALSE ) Create a new Phirehose object attached to the appropriate twitter stream method.
consume ( boolean $reconnect = TRUE ) Connects to the stream API and consumes the stream. Each status update in the stream will cause a call to the enqueueStatus() method.
enqueueStatus ( string $status ) This is the one and only method that must be implemented additionally. As per the streaming API documentation, statuses should NOT be processed within the same process that is performing collection.
getFollow ( ) : array Returns an array of followed Twitter userIds (integers)
getLang ( ) Returns the ISO 639-1 code formatted language string of the current setting. (http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes).
getLastErrorMsg ( ) : string Returns the last error message (TCP or HTTP) that occurred with the streaming API or client. State is cleared upon successful reconnect.
getLastErrorNo ( ) : string Returns the last error number that occurred with the streaming API or client. Numbers correspond to either the fsockopen() error states (in the case of TCP errors) or HTTP error codes from Twitter (in the case of HTTP errors).
getLocations ( ) : array Returns an array of 4 element arrays that denote the monitored location bounding boxes for tweets using the Geotagging API.
getTrack ( ) : array Returns an array of keywords being tracked
heartbeat ( ) : null Reports a periodic heartbeat. Keep execution time minimal.
setCount ( integer $count ) Sets the number of previous statuses to stream before transitioning to the live stream. Applies only to firehose and filter + track methods. This is generally used internally and should not be needed by client applications.
setFollow ( array $userIds ) Returns public statuses from or in reply to a set of users. Mentions ("Hello @user!") and implicit replies ("@user Hello!" created without pressing the reply button) are not matched. It is up to you to find the integer IDs of each twitter user.
setHostPort ( $port ) : void Set host port
setLang ( string $lang ) Restricts tweets to the given language, given by an ISO 639-1 code (http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes).
setLocations ( array $boundingBoxes ) Specifies a set of bounding boxes to track as an array of 4 element lon/lat pairs denoting the south-west point followed by the north-east point. Only tweets that are both created using the Geotagging API and are placed from within a tracked bounding box will be included in the stream. The user's location field is not used to filter tweets. Bounding boxes are logical ORs and must be less than or equal to 1 degree per side. A locations parameter may be combined with track parameters, but note that all terms are logically ORd.
setLocationsByCircle ( $locations ) Convenience method that sets location bounding boxes by an array of lon/lat/radius sets, rather than manually specified bounding boxes. Each array element should be a 3 element subarray containing a longitude, latitude and radius. Radius is specified in kilometers and is approximate (as boxes are square).
setSecureHostPort ( integer $port ) : void Set secure host port
setTrack ( array $trackWords ) Specifies keywords to track. Track keywords are case-insensitive logical ORs. Terms are exact-matched, ignoring punctuation. Phrases, keywords with spaces, are not supported. Queries are subject to Track Limitations.

Protected Methods

Method Description
checkFilterPredicates ( ) Method called as frequently as practical (every 5+ seconds) that is responsible for checking if filter predicates (ie: track words or follow IDs) have changed. If they have, they should be set using the setTrack() and setFollow() methods respectively within the overridden implementation.
connect ( ) Connects to the stream URL using the configured method.
disconnect ( ) Performs forcible disconnect from stream (if connected) and cleanup.
getAuthorizationHeader ( $url, $requestParams )
log ( $message, String $level = 'notice' ) Basic log function that outputs logging to the standard error_log() handler. This should generally be overridden to suit the application environment.
statusUpdate ( ) Called every $this->avgPeriod (default=60) seconds, and this default implementation calculates some rates, logs them, and resets the counters.

Private Methods

Method Description
reconnect ( ) Reconnects as quickly as possible. Should be called whenever a reconnect is required, rather than connect/disconnect, to preserve the stream's reconnect state.

Method Details

__construct() public method

Methods are: METHOD_FIREHOSE, METHOD_RETWEET, METHOD_SAMPLE, METHOD_FILTER, METHOD_LINKS, METHOD_USER, METHOD_SITE. Note: the method might cause the use of a different endpoint URL.
Formats are: FORMAT_JSON, FORMAT_XML.
See also: Phirehose::METHOD_SAMPLE
See also: Phirehose::FORMAT_JSON
public __construct ( string $username, string $password, string $method = Phirehose::METHOD_SAMPLE, string $format = self::FORMAT_JSON, $lang = FALSE )
$username string Any twitter username. When using oAuth, this is the 'oauth_token'.
$password string Any twitter password. When using oAuth, this is your oAuth secret.
$method string
$format string

checkFilterPredicates() protected method

Note that even if predicates are changed every 5 seconds, an actual reconnect will not happen more frequently than every 2 minutes (as per Twitter Streaming API documentation). Note also that this method is called upon every connect attempt, so if your predicates are causing connection errors, they should be checked here and corrected. This should be implemented/overridden in any subclass implementing the FILTER method.
See also: setTrack()
See also: setFollow()
See also: Phirehose::METHOD_FILTER
protected checkFilterPredicates ( )
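
The throttling rule described above (predicates may change every few seconds, but a reconnect should not happen more often than every 2 minutes) could be sketched as a small standalone check. The function name and parameters below are hypothetical and are not part of Phirehose; the 120 second default simply mirrors the "every 2 minutes" figure in the text.

```php
<?php
// Hypothetical helper, not part of Phirehose: decides whether a changed
// filter should trigger a reconnect now, given a minimum interval.
function shouldReconnectForFilterChange(
    bool $filterChanged,
    int $now,
    int $lastFilterUpdate,
    int $minInterval = 120 // assumption: 2 minutes, as stated in the text
): bool {
    // Reconnect only if something changed AND enough time has passed.
    return $filterChanged && ($now - $lastFilterUpdate) >= $minInterval;
}
```

With this sketch, a change detected 100 seconds after the last update would be held back, and applied once at least 120 seconds have elapsed.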

connect() protected method

Connects to the stream URL using the configured method.
protected connect ( )

consume() public method

Note: in normal use this function does not return. If you pass $reconnect as false, it will still not return in normal use: it will only return if the remote side (Twitter) closes the socket (or the socket dies for some other external reason).
See also: enqueueStatus()
public consume ( boolean $reconnect = TRUE )
$reconnect boolean Whether to reconnect automatically, as per the recommended reconnection behaviour.

disconnect() protected method

Performs forcible disconnect from stream (if connected) and cleanup.
protected disconnect ( )

enqueueStatus() abstract public method

This is the one and only method that must be implemented additionally. As per the streaming API documentation, statuses should NOT be processed within the same process that is performing collection.
abstract public enqueueStatus ( string $status )
$status string
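
Since statuses should be collected and processed in separate processes, one possible approach is to have enqueueStatus() do nothing more than append the raw status to a queue that another process reads. The sketch below is a standalone helper, not part of Phirehose: the function name and the file-based queue are assumptions, and it assumes each status arrives as a single line of text (as JSON statuses normally do).

```php
<?php
// Hypothetical helper, not part of Phirehose: appends one raw status per
// line to a queue file for a separate worker process to consume later.
function appendStatusToQueue(string $status, string $queueFile): bool
{
    $line = trim($status);
    if ($line === '') {
        return false; // nothing to enqueue (e.g. a blank keep-alive line)
    }
    // LOCK_EX guards against interleaved writes from concurrent writers.
    $written = file_put_contents($queueFile, $line . PHP_EOL, FILE_APPEND | LOCK_EX);
    return $written !== false;
}
```

A subclass implementation of enqueueStatus( string $status ) could delegate to such a helper and return immediately, keeping time spent in the collection process small.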

getAuthorizationHeader() protected method

protected getAuthorizationHeader ( $url, $requestParams )

getFollow() public method

Returns an array of followed Twitter userIds (integers)
public getFollow ( ) : array
return array

getLang() public method

Returns the ISO 639-1 code formatted language string of the current setting. (http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes).
public getLang ( )

getLastErrorMsg() public method

Returns the last error message (TCP or HTTP) that occurred with the streaming API or client. State is cleared upon successful reconnect.
public getLastErrorMsg ( ) : string
return string

getLastErrorNo() public method

State is cleared upon successful reconnect.
public getLastErrorNo ( ) : string
return string

getLocations() public method

Returns an array of 4 element arrays that denote the monitored location bounding boxes for tweets using the Geotagging API.
See also: setLocations()
public getLocations ( ) : array
return array

getTrack() public method

Returns an array of keywords being tracked
public getTrack ( ) : array
return array

heartbeat() public method

Reports a periodic heartbeat. Keep execution time minimal.
public heartbeat ( ) : null
return null

log() protected method

Basic log function that outputs logging to the standard error_log() handler. This should generally be overridden to suit the application environment.
See also: error_log()
protected log ( $message, String $level = 'notice' )
$level String 'error', 'info', 'notice'. Defaults to 'notice', so you should set this parameter on the more important error messages. 'info' is used for problems that the class should be able to recover from automatically. 'error' is for exceptional conditions that may need human intervention. (For instance, emailing them to a system administrator may make sense.)

setCount() public method

Applies to: METHOD_FILTER, METHOD_FIREHOSE, METHOD_LINKS
public setCount ( integer $count )
$count integer

setFollow() public method

Applies to: METHOD_FILTER
public setFollow ( array $userIds )
$userIds array Array of Twitter integer userIDs

setHostPort() public method

Set host port
public setHostPort ( $port ) : void
return void

setLang() public method

Restricts tweets to the given language, given by an ISO 639-1 code (http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes).
public setLang ( string $lang )
$lang string

setLocations() public method

NOTE: The argument order is Longitude/Latitude (to match the Twitter API and GeoJSON specifications).
Applies to: METHOD_FILTER
See: http://apiwiki.twitter.com/Streaming-API-Documentation#locations
Eg:
  setLocations(array(
    array(-122.75, 36.8, -121.75, 37.8), // San Francisco
    array(-74, 40, -73, 41), // New York
  ));
public setLocations ( array $boundingBoxes )
$boundingBoxes array
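
Because the longitude/latitude order is easy to get backwards, a quick sanity check before calling setLocations() may help. The following is a minimal sketch; isPlausibleBoundingBox() is a hypothetical helper, not part of Phirehose, and it only checks element count, coordinate ranges and corner ordering (it does not enforce the size limit mentioned in the method description).

```php
<?php
// Hypothetical helper, not part of Phirehose: checks that a box looks like
// array(swLon, swLat, neLon, neLat) with coordinates in valid ranges.
function isPlausibleBoundingBox(array $box): bool
{
    $box = array_values($box);
    if (count($box) !== 4) {
        return false;
    }
    foreach ($box as $value) {
        if (!is_int($value) && !is_float($value)) {
            return false; // every element must be numeric
        }
    }
    [$swLon, $swLat, $neLon, $neLat] = $box;
    $lonOk = $swLon >= -180 && $neLon <= 180 && $swLon < $neLon;
    $latOk = $swLat >= -90 && $neLat <= 90 && $swLat < $neLat;
    return $lonOk && $latOk;
}
```

A box that passes this check can still be rejected by the API; the check only catches swapped or out-of-range coordinates early.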

setLocationsByCircle() public method

NOTE: The argument order is Longitude/Latitude (to match the Twitter API and GeoJSON specifications).
Eg:
  setLocationsByCircle(array(
    array(144.9631, -37.8142, 30), // Melbourne, 30km radius
    array(-0.1262, 51.5001, 25), // London, 25km radius
  ));
See also: setLocations()
public setLocationsByCircle ( $locations )
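
To illustrate how a lon/lat/radius triple can be turned into an approximate square bounding box, here is a minimal sketch using a spherical Earth approximation. This is not necessarily the calculation Phirehose performs internally; the function name, the constant and the 6371 km mean radius are assumptions made for the example, and the approximation degrades near the poles.

```php
<?php
// Assumption: mean Earth radius in kilometers (spherical approximation).
const SKETCH_EARTH_RADIUS_KM = 6371.0;

// Hypothetical helper, not part of Phirehose: returns
// array(swLon, swLat, neLon, neLat) enclosing a circle of $radiusKm.
function circleToBoundingBox(float $lon, float $lat, float $radiusKm): array
{
    // Angular distance covered by the radius along a meridian.
    $latDelta = rad2deg($radiusKm / SKETCH_EARTH_RADIUS_KM);
    // Longitude degrees shrink with cos(latitude), so the delta grows.
    $lonDelta = rad2deg($radiusKm / (SKETCH_EARTH_RADIUS_KM * cos(deg2rad($lat))));

    return array($lon - $lonDelta, $lat - $latDelta, $lon + $lonDelta, $lat + $latDelta);
}
```

The result uses the same longitude-first ordering as the boxes accepted by setLocations().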

setSecureHostPort() public method

Set secure host port
public setSecureHostPort ( integer $port ) : void
$port integer
return void

setTrack() public method

Applies to: METHOD_FILTER
See: http://apiwiki.twitter.com/Streaming-API-Documentation#TrackLimiting
public setTrack ( array $trackWords )
$trackWords array

statusUpdate() protected method

Called every $this->avgPeriod (default=60) seconds, and this default implementation calculates some rates, logs them, and resets the counters.
protected statusUpdate ( )

Property Details

$URL_BASE protected property

protected $URL_BASE

$avgElapsed protected property

Reset to zero after each call to statusUpdate(). The highest value it should ever reach is $this->avgPeriod.
protected int $avgElapsed
return integer

$avgPeriod protected property

protected $avgPeriod

$buff protected property

protected $buff

$conn protected property

protected $conn

$connectFailuresMax protected property

Config type vars - override in subclass if desired
protected $connectFailuresMax

$connectTimeout protected property

protected $connectTimeout

$count protected property

protected $count

$enqueueSpent protected property

Total number of seconds (fractional) spent in the enqueueStatus() calls (i.e. the customized function that handles each received tweet).
protected float $enqueueSpent
return float

$enqueueTimeMS protected property

Simply: enqueueSpent divided by statusCount. Note: by default, calculated fresh for the past 60 seconds, every 60 seconds.
protected float $enqueueTimeMS
return float
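
The division described above can be sketched as follows. This assumes the "MS" suffix means the result is expressed in milliseconds (the spent time itself is tracked in fractional seconds); the function name and the two-decimal rounding are illustrative choices, not taken from the Phirehose source.

```php
<?php
// Hypothetical helper, not part of Phirehose: average time per call in
// milliseconds, given total seconds spent and the number of calls.
function averageCallTimeMs(float $spentSeconds, int $callCount): float
{
    if ($callCount <= 0) {
        return 0.0; // no calls in this period, so avoid dividing by zero
    }
    return round($spentSeconds * 1000 / $callCount, 2);
}
```

The same calculation would apply to $filterCheckTimeMS, using $filterCheckSpent and $filterCheckCount as inputs.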

$fdrPool protected property

protected $fdrPool

$filterChanged protected property

State vars
protected $filterChanged

$filterCheckCount protected property

Number of checkFilterPredicates() calls since the last reset. By default that method is called every 5 seconds, so if statusUpdate() runs every 60 seconds and then resets the counter, this will usually be 12.
protected int $filterCheckCount
return integer

$filterCheckMin protected property

protected $filterCheckMin

$filterCheckSpent protected property

Total number of seconds (fractional) spent in the checkFilterPredicates() calls
protected float $filterCheckSpent
return float

$filterCheckTimeMS protected property

Like $enqueueTimeMS but for the checkFilterPredicates() function.
protected float $filterCheckTimeMS
return float

$filterUpdMin protected property

protected $filterUpdMin

$followIds protected property

@see http://dev.twitter.com/pages/streaming_api_methods#count
protected $followIds

$format protected property

protected $format

$hostPort protected property

protected $hostPort

$httpBackoff protected property

protected $httpBackoff

$httpBackoffMax protected property

protected $httpBackoffMax

$idlePeriod protected property

Number of seconds since the last tweet arrived (or the keep-alive newline)
protected int $idlePeriod
return integer

$idleReconnectTimeout protected property

protected $idleReconnectTimeout

$lastErrorMsg protected property

protected $lastErrorMsg

$lastErrorNo protected property

protected $lastErrorNo

$locationBoxes protected property

protected $locationBoxes

$maxIdlePeriod protected property

The maximum value $this->idlePeriod has reached.
protected int $maxIdlePeriod
return integer

$method protected property

protected $method

$password protected property

protected $password

$readTimeout protected property

protected $readTimeout

$reconnect protected property

protected $reconnect

$secureHostPort protected property

protected $secureHostPort

$statusCount protected property

Note: by default this is the sum for last 60 seconds, and is therefore reset every 60 seconds. To change this behaviour write a custom statusUpdate() function.
protected int $statusCount
return integer

$statusRate protected property

The number of tweets received per second in the previous minute, calculated fresh just before each call to statusUpdate(). I.e. if fewer than 30 tweets arrived in the last minute then this will be zero; if 30 to 90 then it will be 1; if 90 to 150 then 2; etc.
protected int $statusRate
return integer
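
The thresholds quoted above are what you get when a per-second average is rounded to the nearest integer. A minimal sketch of that arithmetic, using a hypothetical function name and assuming a 60 second averaging period:

```php
<?php
// Hypothetical helper, not part of Phirehose: tweets per second over the
// elapsed period, rounded to the nearest whole number.
function estimateStatusRate(int $statusCount, int $elapsedSeconds = 60): int
{
    if ($elapsedSeconds <= 0) {
        return 0; // avoid division by zero before any time has elapsed
    }
    return (int) round($statusCount / $elapsedSeconds);
}

echo estimateStatusRate(20), "\n";  // 20 tweets in a minute rounds down to 0
echo estimateStatusRate(60), "\n";  // 60 tweets in a minute is 1 per second
echo estimateStatusRate(120), "\n"; // 120 tweets in a minute is 2 per second
```

Because of the rounding, a low-volume stream can report a rate of zero even though tweets are arriving; $statusCount is the better indicator in that case.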

$status_length_base protected property

protected $status_length_base

$tcpBackoff protected property

protected $tcpBackoff

$tcpBackoffMax protected property

protected $tcpBackoffMax

$trackWords protected property

protected $trackWords

$userAgent protected property

protected $userAgent

$username protected property

Member Attribs
protected $username