Modules Documentation

Traffic Requester Module

class lib.traffic_requester.TrafficRequester(config, prefix='settings_standard', verbose=False)

Bases: object

traffic requester initialization

Parameters
  • config (configparser file) – configuration file

  • prefix (string) – name for log file

  • verbose (bool) – print verbose debugging statements

get_history()

requests traffic history for each repository

Then adds all information to the dataframe

get_repositories()

api request for repositories

checks which repositories are owned by the user or to which the user has contributed. Adds all of these repo names to the dataframe.

log_data()

save raw data to log file

run()

main run function for traffic requester

Analytics Module

class lib.analytics.Analytics(prefix='settings_standard', verbose=False)

Bases: object

analytics initialization

Parameters
  • prefix (string) – name for log file

  • verbose (bool) – print verbose debugging statements

check_dirs()

check and create directories

create log directories if they don’t yet exist and check which raw logs need to be analyzed.

Returns

analytics_needed – the raw logs that do not yet have a corresponding analytics directory

Return type

list
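
The check described above can be sketched with the standard library. This is a minimal illustration, not the module's actual code: the function name `find_analytics_needed`, the two directory arguments, and the convention that an analytics directory shares its name with the raw log file stem are all assumptions.

```python
from pathlib import Path


def find_analytics_needed(raw_dir, analytics_dir):
    """Return the stems of raw logs that have no matching analytics directory.

    Hypothetical layout: raw logs are files in raw_dir, and each analyzed log
    has a directory in analytics_dir named after the log file stem.
    """
    raw_dir, analytics_dir = Path(raw_dir), Path(analytics_dir)
    # Create both directories if they do not exist yet.
    raw_dir.mkdir(parents=True, exist_ok=True)
    analytics_dir.mkdir(parents=True, exist_ok=True)
    # Names of logs that were already analyzed.
    done = {p.name for p in analytics_dir.iterdir() if p.is_dir()}
    return sorted(
        p.stem for p in raw_dir.iterdir() if p.is_file() and p.stem not in done
    )
```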

check_forks_change()

Checks forks counts

checks whether the forks count has changed and appends any changes to self.forks_change

check_stars_change()

Checks star counts

checks whether the stars count has changed and appends any changes to self.stars_change
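
A count comparison of this kind, which would apply equally to forks, might look like the sketch below. The function name `detect_count_changes`, the dictionary inputs, and the tuple format of each recorded change are assumptions; the actual structure of self.stars_change is not documented here.

```python
def detect_count_changes(previous, current):
    """Return (repo, old, new) for every repo whose count differs.

    previous and current are hypothetical {repo_name: count} snapshots.
    """
    changes = []
    for repo, new_count in current.items():
        old_count = previous.get(repo)
        # Skip repos with no earlier snapshot: nothing to compare against.
        if old_count is not None and old_count != new_count:
            changes.append((repo, old_count, new_count))
    return changes
```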

check_tracking_change()

check tracked repositories

checks which repositories are beginning to be tracked or have stopped being tracked.
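
One way to express this check is as two set differences between the previous and current lists of repository names. The sketch below assumes that approach; `tracking_changes` and its arguments are hypothetical names.

```python
def tracking_changes(previous_repos, current_repos):
    """Return (started, stopped) lists of repository names."""
    previous, current = set(previous_repos), set(current_repos)
    started = sorted(current - previous)  # present now, absent before
    stopped = sorted(previous - current)  # present before, absent now
    return started, stopped
```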

create_repo_dirs()

create log directories if they don’t yet exist

full2dir(fullname)

changes full repository name into a directory name

Parameters

fullname (string) – full repository name

Returns

dirname – new directory name

Return type

string
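
The exact mapping is not documented here. As an illustration only, a full repository name of the form "owner/repo" could be made safe for use as a single directory name by replacing the path separator. The function name `full_name_to_dir` and the underscore replacement are assumptions.

```python
def full_name_to_dir(fullname):
    """Turn a full repository name into a single directory name.

    Assumed mapping: "owner/repo" becomes "owner_repo", so the name cannot
    be interpreted as a nested path.
    """
    return fullname.replace("/", "_")
```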

load_log()

load log file into dataframe

log_analytics()

Logs the analytics to a json file

run()

main run function for analytics

sort_raw_data()

Sort through each of the main metrics for each repository

update_daily_metric(ri, col_name)

update metrics that are daily

this function reads through the old data and only adds new daily values

Parameters
  • ri (int) – row of dataframe to read from

  • col_name (string) – column name and thus file name for the specific metric
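
Because successive traffic requests are likely to return overlapping windows of days, only dates that are not already logged should be added. The sketch below shows that idea on plain lists of (date, value) pairs; the function name `merge_daily` and the data layout are assumptions, and the real method works on a dataframe row and a per-metric file.

```python
def merge_daily(old_rows, new_rows):
    """Return old_rows plus any new_rows whose date is not yet logged.

    Each row is a (YYYY-MM-DD, value) tuple. Existing values are kept as-is.
    """
    seen = {date for date, _ in old_rows}
    merged = list(old_rows)
    for date, value in new_rows:
        if date not in seen:
            merged.append((date, value))
            seen.add(date)
    # ISO dates sort chronologically as strings.
    return sorted(merged)
```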

update_nondaily_metric(ri, col_name)

update nondaily metrics

update metrics that are not daily. This function simply appends the newest value to the log file

Parameters
  • ri (int) – row of dataframe to read from

  • col_name (string) – column name and thus file name for the specific metric

Plotter Module

class lib.plotter.Plotter(prefix='settings_standard')

Bases: object

Plotter class.

Parameters

prefix (string) – name for log file

create_email_plots(date_cur, date_prev=None)

create and save some plots for use in an email

Parameters
  • date_cur (string) – YYYY-MM-DD, date of current analytics file

  • date_prev (string) – YYYY-MM-DD, date of previous analytics file

Returns

fig_paths – list of strings giving the location where each figure is saved

Return type

list

create_plots(verbose=False)

create a bunch of plots as desired

Parameters

verbose (bool) – print verbose debugging statements

plot_daily_metrics(col_name, type='daily', top_num=None, date_filter=None)

plot daily metrics.

The plots are saved to the default location if no date filter is applied

Parameters
  • col_name (string) – name for filename and column name

  • type (string) – either “cumsum” or “daily”. “cumsum” will plot the cumulative sum of the column over time while “daily” will plot the daily change over time

  • top_num (int) – number of top repositories (according to cumulative sum) to show in the graph. Repos with a cumulative value of 0 will still not be plotted

  • date_filter (string) – “YYYY-MM-DD”, all data after this date (inclusive) will be plotted. None means all data will be plotted

Returns

fig – new figure

Return type

matplotlib figure
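
The effect of the type, top_num, and date_filter parameters can be illustrated without any plotting. The sketch below prepares the series that would be drawn; `prepare_series` and its input layout are hypothetical, and applying the date filter before ranking and before the cumulative sum is an assumption, since the order is not documented here.

```python
from itertools import accumulate


def prepare_series(daily_by_repo, kind="daily", top_num=None, date_filter=None):
    """Return {repo: (dates, values)} ready for plotting.

    daily_by_repo maps a repo name to date-sorted (YYYY-MM-DD, value) rows.
    """
    prepared = {}
    for repo, rows in daily_by_repo.items():
        if date_filter is not None:
            # ISO dates compare correctly as strings; the bound is inclusive.
            rows = [(d, v) for d, v in rows if d >= date_filter]
        values = [v for _, v in rows]
        total = sum(values)
        if total == 0:
            continue  # repos with a cumulative value of 0 are not plotted
        if kind == "cumsum":
            values = list(accumulate(values))
        prepared[repo] = (total, [d for d, _ in rows], values)
    # Rank repos by cumulative sum and keep the top ones if requested.
    ranked = sorted(prepared, key=lambda r: prepared[r][0], reverse=True)
    if top_num is not None:
        ranked = ranked[:top_num]
    return {r: (prepared[r][1], prepared[r][2]) for r in ranked}
```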

plot_repo_metric(repo_dir, metric_name, type)

plots individual repository metrics and saves the plots

Parameters
  • repo_dir (string) – filepath to the repository logs

  • metric_name (string) – name for metric and column name

  • type (string) – either “cumsum” or “daily”. “cumsum” will plot the cumulative sum of the column over time while “daily” will plot the daily change over time

save_and_close(fig, plt_file)

saves and closes the figure

Parameters
  • fig (matplotlib fig) – figure object

  • plt_file (string) – filepath for the figure

update_repo_plots(verbose=False)

update all repo plots.

This function in particular takes a long time. You can skip calling it if you need the code to run faster

Parameters

verbose (bool) – print verbose debugging statements

Email Sender Module

class lib.email_sender.EmailSender(config, prefix, verbose=False)

Bases: object

email sender initialization

Parameters
  • config (configparser file) – configuration file

  • prefix (string) – name for log file

  • verbose (bool) – print verbose debugging statements

build_html_message()

Build HTML message

create the bulk of the html message by combining many strings that include the tracked analytics and the plots that were created

Returns

msg – long string that contains the html message

Return type

string

build_service()

builds gmail api service.

Code copied with minor edits from https://developers.google.com/gmail/api/quickstart/python

Returns

service – gmail api service

Return type

gmail api

create_mixed_message(message_html)

Create a message for an email.

Copied with edits from https://developers.google.com/gmail/api/guides/sending

Also see this answer for how to add attachments: https://stackoverflow.com/questions/1633109/

Parameters

message_html (string) – html text message to be sent

Returns

msg_object – email object

Return type

base64url encoded email object
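
A mixed message with an HTML body and image attachments can be assembled with the standard library alone. The following is a sketch under stated assumptions, not the module's code: the function name `build_raw_message`, its extra parameters (sender, to, subject, png_attachments), and the choice of PNG attachments are all hypothetical. The returned dictionary with a base64url encoded "raw" field is the form the Gmail API accepts for sending.

```python
import base64
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_raw_message(sender, to, subject, message_html, png_attachments=()):
    """Build a multipart/mixed email and return it base64url encoded.

    png_attachments is an iterable of (filename, bytes) pairs (assumed format).
    """
    msg = MIMEMultipart("mixed")
    msg["To"] = to
    msg["From"] = sender
    msg["Subject"] = subject
    msg.attach(MIMEText(message_html, "html"))
    for filename, data in png_attachments:
        # The subtype is given explicitly so no image type detection is needed.
        image = MIMEImage(data, _subtype="png")
        image.add_header("Content-Disposition", "attachment", filename=filename)
        msg.attach(image)
    raw = base64.urlsafe_b64encode(msg.as_bytes()).decode("ascii")
    return {"raw": raw}
```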

prep_attachments()

Prepare attachments.

call the plotter function and correlate figure names with the figures that were created

run()

main run function for the email sender

send_message(service, user_id, message)

Send an email message.

Copied with minor edits from https://developers.google.com/gmail/api/guides/sending

Parameters
  • service (Gmail API service instance) – Authorized Gmail API

  • user_id (string) – User’s email address. The special value of “me” can be used to indicate the authenticated user.

  • message (string) – Message to be sent.

Returns

message – the sent message

Return type

message object
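
For orientation, sending through the Gmail API comes down to one chained call on the service object. The sketch below assumes that the message is a dictionary holding a base64url encoded "raw" field and that service was built elsewhere; the function name `send_raw_message` is hypothetical and error handling is omitted.

```python
def send_raw_message(service, user_id, message):
    """Send an already encoded message and return the API response.

    service: an authorized Gmail API service object (built elsewhere).
    user_id: an email address, or "me" for the authenticated user.
    message: assumed to be a dict such as {"raw": "<base64url string>"}.
    """
    request = service.users().messages().send(userId=user_id, body=message)
    return request.execute()
```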