Analysis_convert

Analysis

class logio.core.Analysis[source]

Bases: object

This class reads files and performs basic data analysis. It houses five methods, which cover:

reading of files {csv, xlsx, json, las};
returning a statistical description of the file;
returning a correlation matrix of the numerical features;
returning a cross tab of the categorical features specified;
returning a frequency table of the categorical features specified.

__init__()[source]

No global attribute to initialize.

correlate(data, method: str = 'pearson')[source]

Compute pairwise correlation of columns in a dataframe or lasio.las.LASFile, excluding NA/null and non-numerical values.

Parameters
  • data – Dataframe or lasio.las.LASFile whose columns are to be correlated.

  • method ({'pearson', 'kendall', 'spearman'}) –

    Method of correlation:

    • pearson : standard correlation coefficient

    • kendall : Kendall Tau correlation coefficient

    • spearman : Spearman rank correlation

Returns

Correlation matrix.

Return type

DataFrame
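To make "pairwise correlation excluding NA/null values" concrete, the following is a minimal standard-library sketch of the pearson method. It is not the logio implementation: the helper names `pearson` and `correlation_matrix` and the sample columns are hypothetical, and missing readings are assumed to be marked with `None`.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two columns, skipping pairs with a missing value."""
    # Keep only the positions where both columns have a reading.
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Nested dict {col_a: {col_b: r}} covering every pair of columns."""
    return {a: {b: pearson(columns[a], columns[b]) for b in columns} for a in columns}


# Hypothetical well-log columns; None marks a missing reading.
logs = {
    "GR": [10.0, 20.0, 30.0, None, 50.0],
    "RHOB": [2.0, 4.0, 6.0, 8.0, 10.0],
    "NPHI": [5.0, 4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(logs)
```

Here the missing GR reading only removes the fourth row from the pairs that involve GR; the RHOB/NPHI pair still uses all five rows.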

cross_tab(data, row, column)[source]

Returns a crosstab for two selected categorical features in the dataset.

Parameters
  • data – Dataset in csv, xlsx, json or las format. The dataset from which the crosstab of the selected categorical features (columns) is extracted.

  • row – The first selected categorical feature (column).

  • column – The second selected categorical feature (column).

Returns

A crosstab of the two selected categorical features in the dataset.
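For readers unfamiliar with crosstabs, the idea can be sketched with the standard library alone: count how often each (row value, column value) pair occurs. This is an illustration rather than the library's code; the function name `cross_tab_counts` and the sample records are hypothetical.

```python
from collections import Counter


def cross_tab_counts(records, row, column):
    """Count co-occurrences of two categorical fields as {row_value: {column_value: n}}."""
    counts = Counter((rec[row], rec[column]) for rec in records)
    row_labels = sorted({r for r, _ in counts})
    col_labels = sorted({c for _, c in counts})
    # A Counter returns 0 for pairs that never occur, so every cell is filled.
    return {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}


# Hypothetical records with two categorical columns.
records = [
    {"WELL": "A", "LITHOLOGY": "sand"},
    {"WELL": "A", "LITHOLOGY": "shale"},
    {"WELL": "A", "LITHOLOGY": "sand"},
    {"WELL": "B", "LITHOLOGY": "shale"},
]
table = cross_tab_counts(records, "WELL", "LITHOLOGY")
```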

describe(data)[source]

Returns the mean, median, minimum, maximum, skewness, kurtosis and Jarque-Bera statistic of the numerical columns of a dataset.

The Jarque-Bera test returns NaN for columns with missing values.

Parameters
  • data – The dataset to be analysed, in csv, xlsx or las format.

Returns

A summary of the mean, median, minimum, maximum, skewness, kurtosis and Jarque-Bera statistic of the dataset.
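The Jarque-Bera statistic combines skewness S and kurtosis K as JB = n/6 · (S² + (K − 3)²/4). The sketch below computes it from population moments and returns NaN when a value is missing, mirroring the behaviour documented above with an explicit check. It is an assumption-laden illustration: the function name `jarque_bera` is hypothetical, and the library may use different skewness and kurtosis estimators.

```python
import math


def jarque_bera(values):
    """Jarque-Bera statistic from population moments; NaN if any value is missing."""
    if any(v is None or (isinstance(v, float) and math.isnan(v)) for v in values):
        return math.nan
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # non-excess kurtosis; equals 3 for a normal distribution
    return n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)


complete = jarque_bera([1.0, 2.0, 3.0, 4.0, 5.0])
with_gap = jarque_bera([1.0, None, 3.0])
```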

frequencyTable(data: Union[DataFrame, Series, Any], col_names: Union[str, List[str]]) → Union[DataFrame, List[DataFrame]][source]

Computes a frequency table for the categorical variables in the dataframe.

Parameters
  • data – The pandas dataframe from which the frequency table is computed.

  • col_names – A string or a list of strings indicating the columns that contain the categorical data.

Returns

A pandas dataframe that contains the frequency table, or a list of such dataframes.
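The signature accepts either one column name or a list of names. One plausible reading, assumed here and not confirmed by the source, is that a string yields a single table and a list yields one table per column. A standard-library sketch of that shape follows; `frequency_table` and the sample records are hypothetical.

```python
from collections import Counter


def frequency_table(records, col_names):
    """Value counts for one column (str) or for several columns (list of str)."""

    def count_column(col):
        # most_common() orders values from most to least frequent.
        return dict(Counter(rec[col] for rec in records).most_common())

    if isinstance(col_names, str):
        return count_column(col_names)
    return [count_column(col) for col in col_names]


# Hypothetical records with two categorical columns.
records = [
    {"WELL": "A", "LITHOLOGY": "sand"},
    {"WELL": "A", "LITHOLOGY": "sand"},
    {"WELL": "B", "LITHOLOGY": "shale"},
    {"WELL": "B", "LITHOLOGY": "sand"},
]
single = frequency_table(records, "LITHOLOGY")
several = frequency_table(records, ["WELL", "LITHOLOGY"])
```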

read_file(filename: str) → DataFrame[source]

Reads a file in any of the supported formats into a DataFrame.

Parameters
  • filename – The name of the file to read.

Returns

A dataframe.

FileConverter

class logio.core.FileConverter(filename, output_format)[source]

Bases: object

This class converts files from one format to another. The allowed extensions are tailored towards possible well-log data formats:

csv, xlsx, json, las

__init__(filename, output_format)[source]

Initialize filename and output_format attribute.

convert_file()[source]

This method takes data in one format and writes it, in the format specified by the user, to the current working directory.

input_format : the input file format.

output_format : the specified output file format (csv, xlsx, json or las).

The csv format is the central format: for any conversion from an input_format to an output_format other than csv, the input is first converted to csv and then converted to the required output file format.
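The "csv as central format" design can be sketched as two lookup tables: one that turns each input format into csv, and one that turns csv into each output format. The example below is a standard-library illustration covering only csv and json, working on in-memory text; all names (`TO_CSV`, `FROM_CSV`, `convert`) are hypothetical, and xlsx and las would need third-party readers and writers.

```python
import csv
import io
import json


def json_to_csv(text):
    """Turn a JSON array of flat records into csv text."""
    records = json.loads(text)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def csv_to_json(text):
    """Turn csv text into a JSON array of records (all values become strings)."""
    return json.dumps(list(csv.DictReader(io.StringIO(text))))


# Every format only needs a converter to and from the csv hub.
TO_CSV = {"csv": lambda text: text, "json": json_to_csv}
FROM_CSV = {"csv": lambda text: text, "json": csv_to_json}


def convert(text, input_format, output_format):
    """Convert via csv: input_format -> csv -> output_format."""
    csv_text = TO_CSV[input_format](text)
    return FROM_CSV[output_format](csv_text)


source = json.dumps([{"DEPT": 1000.0, "GR": 45.2}, {"DEPT": 1000.5, "GR": 47.9}])
as_csv = convert(source, "json", "csv")
round_trip = json.loads(convert(as_csv, "csv", "json"))
```

With a hub format, supporting N formats takes 2N converters instead of one per ordered pair. The trade-off, visible in `round_trip`, is that anything csv cannot represent (here, numeric types) is lost on the way through.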