Plotting Statistics

litstudy.plot.plot_histogram(data: DataFrame, keys=None, title='', xlabel='', ylabel='', label_rotation=None, vertical=False, bar_width=0.8, max_label_length=100, stacked=False, legend=True, relative_to=None, ax=None)

General function for plotting a histogram (bar plot). All other plot_*_histogram functions in this module call this function.

This function takes a pandas DataFrame. Each column represents one group, drawn as a sequence of bars. The names in the index are placed as labels on the axis.

For instance, a possible input could be a data frame in which the columns are different authors, the rows are different years, and the values are the number of publications per year per author.
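As a minimal sketch of that layout, the snippet below builds such a data frame with pandas. The author names and counts are made up for illustration, and the helper draw is hypothetical: it only forwards the frame to plot_histogram using parameters from the signature above, and it is not called here so the snippet runs without litstudy installed.

```python
import pandas as pd

# Made-up publication counts: one row per year, one column per author.
counts = pd.DataFrame(
    {"Author A": [2, 3, 1], "Author B": [0, 4, 2]},
    index=[2019, 2020, 2021],
)


def draw(data):
    """Hypothetical helper: plot the frame where litstudy is installed."""
    from litstudy.plot import plot_histogram

    return plot_histogram(data, title="Publications per year", stacked=True)


# Each column is one group of bars; each index entry is one axis label.
print(list(counts.columns))
print(list(counts.index))
```

Calling draw(counts) in an environment with litstudy and matplotlib available should draw onto the current Axes, since ax defaults to None.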

Parameters:
  • ax -- The matplotlib Axes instance the plot will be drawn on. If None, the current Axes instance is used (plt.gca()).

  • vertical -- By default, bars are horizontal (left to right). Set vertical=True for vertical bars (bottom to top).

  • bar_width -- Width of the bars. Should be at most 1.0 (100%).

  • label_rotation -- Rotates the labels on the X axis. This is useful if vertical=True, since it can be used to place the labels horizontally (label_rotation=0), vertically (label_rotation=90), or diagonally (label_rotation=45).

  • max_label_length -- Labels longer than this length are shortened.

  • stacked -- By default, different groups are drawn next to each other. If True, the different groups are stacked on top of each other instead.

  • legend -- Show legend.

  • relative_to -- If not None, all bars will be plotted as a percentage relative to this value.

  • title -- Title of plot (set using ax.set_title).

  • xlabel -- Title on the X axis (or Y axis if vertical=True).

  • ylabel -- Title on the Y axis (or X axis if vertical=True).
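The effect of relative_to can be pictured as a rescaling of the data. The sketch below assumes the percentage is computed as value * 100 / relative_to, which the description implies but does not spell out. The counts and the total of 20 documents are made up, and this is not the library's own code.

```python
import pandas as pd

# Made-up counts per year for two hypothetical groups.
counts = pd.DataFrame(
    {"Journal": [5, 10], "Conference": [3, 2]},
    index=[2020, 2021],
)

# Hypothetical reference value, e.g. the total number of documents.
relative_to = 20

# Assumed conversion: express every bar as a percentage of relative_to.
percentages = counts * 100 / relative_to
print(percentages)
```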

litstudy.plot.plot_year_histogram(docs: DocumentSet, **kwargs)

Plot histogram of the number of documents published in each year.

Parameters:

kwargs -- Passed to plot_histogram.
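The forwarding of kwargs can be illustrated with a small stand-alone sketch. Both functions below are hypothetical stand-ins, not litstudy's implementation: the stub only records what it was asked to draw, and the input is a plain list of years rather than a DocumentSet.

```python
from collections import Counter


def plot_histogram_stub(table, title="", vertical=False, **options):
    """Stand-in for plot_histogram: report the arguments it received."""
    return {"table": table, "title": title, "vertical": vertical, **options}


def plot_year_histogram_sketch(years, **kwargs):
    """Count documents per year, then forward every keyword argument."""
    per_year = dict(sorted(Counter(years).items()))
    return plot_histogram_stub(per_year, **kwargs)


# Made-up publication years for five documents.
result = plot_year_histogram_sketch(
    [2020, 2019, 2020, 2021, 2020], vertical=True, title="Per year"
)
print(result)
```

With the real functions, the same pattern means that a call such as plot_year_histogram(docs, vertical=True) hands vertical=True on to plot_histogram.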

litstudy.plot.plot_author_histogram(docs: DocumentSet, **kwargs)

Plot histogram of the number of documents published per author.

Parameters:

kwargs -- Passed to plot_histogram.

litstudy.plot.plot_number_authors_histogram(docs: DocumentSet, **kwargs)

Plot histogram of the number of authors per document.

Parameters:

kwargs -- Passed to plot_histogram.

litstudy.plot.plot_author_affiliation_histogram(docs: DocumentSet, **kwargs)

Plot histogram of the number of documents published per author affiliation.

Parameters:

kwargs -- Passed to plot_histogram.

litstudy.plot.plot_language_histogram(docs: DocumentSet, **kwargs)

Plot histogram of the number of documents by language.

litstudy.plot.plot_source_histogram(docs: DocumentSet, **kwargs)

Plot histogram of the number of documents by publication source.

litstudy.plot.plot_source_type_histogram(docs: DocumentSet, **kwargs)

Plot histogram of the number of documents by publication source type.

Parameters:

kwargs -- Passed to plot_histogram.

litstudy.plot.plot_affiliation_histogram(docs: DocumentSet, **kwargs)

Plot histogram of the number of documents by author affiliation.

Parameters:

kwargs -- Passed to plot_histogram.

litstudy.plot.plot_country_histogram(docs: DocumentSet, **kwargs)

Plot histogram of the number of documents by country of author affiliation.

Parameters:

kwargs -- Passed to plot_histogram.

litstudy.plot.plot_continent_histogram(docs: DocumentSet, **kwargs)

Plot histogram of the number of documents by continent of author affiliation.

Parameters:

kwargs -- Passed to plot_histogram.

litstudy.plot.plot_word_distribution(corpus: Corpus, *, limit=25, **kwargs)

Plot the frequency of the top words in the given corpus.

Parameters:

kwargs -- Passed to plot_histogram.
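As a rough picture of what "top words" with limit means, the sketch below counts word occurrences over a few made-up tokenized documents and keeps the most frequent ones. It assumes raw occurrence counts. Whether litstudy counts occurrences or the number of documents containing a word is not stated here, and top_words is a hypothetical helper that works on plain lists rather than a Corpus.

```python
from collections import Counter

# Made-up tokenized documents standing in for a corpus.
documents = [
    ["graph", "neural", "network"],
    ["neural", "network", "training"],
    ["graph", "network"],
]


def top_words(docs, limit=25):
    """Return the `limit` most frequent words with their counts."""
    counts = Counter(word for doc in docs for word in doc)
    return counts.most_common(limit)


print(top_words(documents, limit=2))
```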

litstudy.plot.plot_embedding(corpus: Corpus, model: TopicModel, layout=None, ax=None)

TODO

litstudy.plot.plot_topic_clouds(model: TopicModel, *, fig=None, ncols=3, **kwargs)

Plot word clouds for each of the topics from the given topic model.
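The ncols parameter suggests that the clouds are arranged in a grid. The sketch below shows one way such a grid could be computed, assuming the cells are filled row by row. That fill order and the helper grid_positions are assumptions made for illustration, not taken from litstudy.

```python
import math


def grid_positions(num_topics, ncols=3):
    """Return the row count and a (row, column) cell for each topic."""
    nrows = math.ceil(num_topics / ncols)
    positions = [divmod(index, ncols) for index in range(num_topics)]
    return nrows, positions


# Seven topics in three columns need three rows; the last row holds one cloud.
nrows, positions = grid_positions(7, ncols=3)
print(nrows, positions)
```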

litstudy.plot.plot_topic_cloud(model: TopicModel, topic_id: int, *, ax=None, **kwargs)

Plot a word cloud for the given topic from the given topic model.