core.analysis_results
Submodule of khiops.core
Classes to access Khiops JSON reports
Class Overview
The diagrams below describe the relationships between the classes in this module. They are mostly compositions (has-a relations); native attributes (str, int, float, etc.) are omitted.
The main class of this module is AnalysisResults. It is largely a composition of sub-report objects with the following structure:
AnalysisResults
|- preparation_report -> PreparationReport
|- bivariate_preparation_report -> BivariatePreparationReport
|- modeling_report -> ModelingReport
|- train_evaluation_report |
|- test_evaluation_report |-> EvaluationReport
|- evaluation_report |
These sub-classes in turn use other tertiary classes to represent specific pieces of information in each report. The dependencies for the classes PreparationReport and BivariatePreparationReport are:
PreparationReport
|- variables_statistics -> list of VariableStatistics
BivariatePreparationReport
|- variable_pair_statistics -> list of VariablePairStatistics
VariableStatistics
|- data_grid -> DataGrid
VariablePairStatistics
|- data_grid -> DataGrid
DataGrid
|- dimensions -> list of DataGridDimension
DataGridDimension
|- partition -> list of PartInterval OR
| list of PartValue OR
| list of PartValueGroup
for the class ModelingReport:
ModelingReport
|- trained_predictors -> list of TrainedPredictor
TrainedPredictor
|- selected_variables -> list of SelectedVariable
and for the class EvaluationReport:
EvaluationReport
|- predictors_performance -> list of PredictorPerformance
|- classification_lift_curves -> list of PredictorCurve (classification only)
|- regression_rec_curves -> list of PredictorCurve (regression only)
PredictorPerformance
|- confusion_matrix -> ConfusionMatrix (classification only)
For a complete illustration of how to access the information of all classes in this module, see their write_report methods, which write TSV (tab-separated values) reports.
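The composition shown in the AnalysisResults diagram can be walked generically. The following is a minimal sketch that relies only on the six sub-report attribute names from that diagram and on the fact that unavailable sub-reports are set to None. The helper `available_sub_reports` and the `SimpleNamespace` object are illustrative stand-ins, not part of the khiops package.

```python
from types import SimpleNamespace

# Sub-report attribute names, as listed in the AnalysisResults diagram
SUB_REPORT_ATTRIBUTES = (
    "preparation_report",
    "bivariate_preparation_report",
    "modeling_report",
    "train_evaluation_report",
    "test_evaluation_report",
    "evaluation_report",
)


def available_sub_reports(results):
    """Maps attribute name -> sub-report for every sub-report that is not None."""
    found = {}
    for attribute in SUB_REPORT_ATTRIBUTES:
        report = getattr(results, attribute, None)
        if report is not None:
            found[attribute] = report
    return found


# Hypothetical stand-in for an AnalysisResults instance of an unsupervised
# analysis: only the preparation report is present.
stand_in = SimpleNamespace(
    preparation_report="<PreparationReport>",
    bivariate_preparation_report=None,
    modeling_report=None,
    train_evaluation_report=None,
    test_evaluation_report=None,
    evaluation_report=None,
)
present = available_sub_reports(stand_in)
print(list(present))
```

Because the helper only uses attribute access, it should also accept a real instance, provided the attribute names match those documented here.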
Functions
read_analysis_results_file | Reads a Khiops JSON report
Classes
AnalysisResults | Main class containing the information of a Khiops JSON file
BivariatePreparationReport | Bivariate data preparation report: 2D grid models
ConfusionMatrix | A classifier's confusion matrix
DataGrid | A piecewise constant probability density estimation
DataGridDimension | A dimension (variable) of a data grid
EvaluationReport | Evaluation report for predictors
ModelingReport | Modeling report of all predictors created in a supervised analysis
PartInterval | Element of a numerical interval partition in a data grid
PartValue | Element of a value partition (singletons) in a data grid
PartValueGroup | Element of a categorical partition in a data grid
PredictorCurve | A lift curve for a classifier or a REC curve for a regressor
PredictorPerformance | A predictor's performance evaluation
PreparationReport | Univariate data preparation report: discretizations and groupings
SelectedVariable | Information about a selected variable in a predictor
TrainedPredictor | Trained predictor information
VariablePairStatistics | Variable pair information and statistics
VariableStatistics | Variable information and statistics
- class khiops.core.analysis_results.AnalysisResults(json_data=None)
Bases: KhiopsJSONObject
Main class containing the information of a Khiops JSON file
Sub-reports not available in the JSON data are optional (set to None).
- Parameters:
- json_data : dict, optional
A dictionary representing the data of a Khiops JSON report file. If not specified it returns an empty instance.
Note
See also the read_analysis_results_file function from the core API to obtain an instance of this class from a Khiops JSON file.
- Attributes:
- tool : str
Name of the Khiops tool that generated the report.
- version : str
Version of the Khiops tool that generated the report.
- short_description : str
Short description defined by the user.
- logs : list of tuples
2-tuples linking each sub-task name to a list containing the warnings and errors found during the execution of that sub-task. Available only if there were errors or warnings.
- preparation_report : PreparationReport
A report about the variables’ discretizations and groupings.
- bivariate_preparation_report : BivariatePreparationReport, optional
A report of the grid models created from pairs of variables. Available only when pairs of variables were created in the analysis.
- modeling_report : ModelingReport
A report describing the predictor models. Available only in supervised analysis.
- train_evaluation_report : EvaluationReport
An evaluation report of the trained models on the train dataset split. Available only in supervised analysis.
- test_evaluation_report : EvaluationReport
An evaluation report of the trained models on the test dataset split. Available only in supervised analysis and when the test split was not empty.
- evaluation_report : EvaluationReport
An EvaluationReport instance for evaluations created with an explicit evaluation (either with the evaluate_predictor core API function or the Evaluate Predictor feature of the Khiops desktop app). Available only when the report was generated with the aforementioned features.
- get_reports()
Returns all available sub-reports
- Returns:
- list
All available sub-reports.
- write_report(stream_or_writer)
Writes the instance’s TSV report into a writer object
- Parameters:
- stream_or_writer : io.IOBase or KhiopsOutputWriter
Output stream or writer.
- write_report_file(report_file_path)
Writes a TSV report file with the object’s information
- Parameters:
- report_file_path : str
Path of the output TSV report file.
- class khiops.core.analysis_results.BivariatePreparationReport(json_data=None)
Bases: object
Bivariate data preparation report: 2D grid models
The attributes related to the target variable and null model are available only in the case of a supervised learning task (only classification in the bivariate case).
- Parameters:
- json_data : dict, optional
JSON data of the bivariatePreparationReport field of a Khiops JSON report file. If not specified it returns an empty instance.
- Attributes:
- report_type : “BivariatePreparation” (only possible value)
Report type.
- dictionary : str
Name of the training data table dictionary.
- variable_types : list of str
The different types of variables.
- variable_numbers : list of int
The number of variables for each type in variable_types (synchronized lists).
- database : str
Path of the main training data table file.
- sample_percentage : int
Percentage of instances used in training.
- sampling_mode : str
Sampling mode used to split the train and test datasets.
- selection_variable : str
Variable used to select instances for training.
- selection_value : str
Value of selection_variable to select instances for training.
- instance_number : int
Number of training instances.
- learning_task : str
Name of the associated learning task. Possible values:
“Classification analysis”
“Regression analysis”
“Unsupervised analysis”
- target_variable : str
Target variable name in supervised analysis.
- main_target_value : str
Main modality of the target variable in the supervised case.
- target_stats_mode : str
Mode of a categorical target variable.
- target_stats_mode_frequency : int
Mode frequency of a categorical target variable.
- target_values : list of str
Values of a categorical target variable.
- target_value_frequencies : list of int
Frequencies for each value in target_values (synchronized lists).
- evaluated_pair_number : int
Number of variable pairs evaluated.
- informative_pair_number : int
Number of informative variable pairs. A pair is considered informative if its level is greater than the sum of its components’ levels.
- variable_pair_statistics : list of VariablePairStatistics
Statistics for each analyzed pair of variables.
- get_variable_pair_names()
Returns the pairs of variable names available in this report
- Returns:
- list of tuple
The pairs of variable names available in this report.
- get_variable_pair_statistics(variable_name_1, variable_name_2)
Returns the statistics of the specified pair of variables
Note
The variable names can be given in any order.
- Parameters:
- variable_name_1 : str
Name of the first variable.
- variable_name_2 : str
Name of the second variable.
- Returns:
- VariablePairStatistics
The statistics of the specified pair of variables.
- Raises:
- KeyError
If no pair with the specified names exists.
- write_report(writer)
Writes the instance’s TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.analysis_results.ConfusionMatrix(json_data=None)
Bases: object
A classifier’s confusion matrix
- Parameters:
- json_data : dict, optional
JSON data of the confusionMatrix field of an element of the dictionary found at the predictorsDetailedPerformances field within one of the evaluation report fields of a Khiops JSON report file. If not specified it returns an empty object.
- Attributes:
- values : list of str
Values of the target variable.
- matrix : list
Matrix of predicted frequencies vs target frequencies. This list is synchronized with values. Each list element represents a row of the confusion matrix, that is, the target frequencies for a fixed predicted target value.
- write_report(writer)
Writes the instance’s TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
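As an illustration of the row layout described for matrix, the sketch below derives the accuracy from the two documented attributes. It assumes that the columns of each row follow the same order as values, which the text above does not state explicitly. The function name and the sample data are hypothetical.

```python
def accuracy_from_confusion_matrix(values, matrix):
    """Fraction of instances whose predicted value equals the target value.

    Each row holds the target frequencies for one predicted value.
    Assumption: columns are ordered like `values`.
    """
    if len(matrix) != len(values):
        raise ValueError("matrix must have one row per target value")
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("the confusion matrix is empty")
    # Diagonal cells are the instances where prediction and target agree
    correct = sum(matrix[i][i] for i in range(len(values)))
    return correct / total


# Hypothetical data: rows are predicted values, columns are target values
values = ["less", "more"]
matrix = [[90, 10], [5, 45]]
accuracy = accuracy_from_confusion_matrix(values, matrix)
print(accuracy)
```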
- class khiops.core.analysis_results.DataGrid(json_data=None)
Bases: object
A piecewise constant probability density estimation
A data grid represents one or many variables referred to as “dimensions” to differentiate them from the original data variables. Each dimension can be partitioned by:
Intervals for numerical variables
Values (singletons) / Value groups for categorical variables
The Cartesian product of the unidimensional partitions provides a multivariate partition of cells whose frequencies make it possible to estimate the multivariate probability density.
In the univariate case, the data grid is simply a histogram. In the case of multiple variables, the data grid may be supervised or not. If supervised, the target variable is the last one, and the data grid represents the conditional density estimator of the source variable with respect to the target. Otherwise, it represents a joint density estimator.
In the case of an unsupervised data grid, the cells are described by their index on the variable partitions, together with their frequencies. For a supervised data grid, the cells are described by their index on the input variables’ partitions, and a vector of target frequencies is associated with each cell.
- Parameters:
- json_data : dict, optional
JSON data at a dataGrid field of an element of the list found at the variablesDetailedStatistics field within the preparationReport field of a Khiops JSON report file. If not specified it returns an empty instance.
- Attributes:
- is_supervised : bool
True if the data grid is supervised (there is a target).
- dimensions : list of DataGridDimension
The dimensions of the data grid.
- frequencies : list of int
Unsupervised only: Frequencies for each part.
- part_interests : list of float
Supervised univariate only: Prediction interests for each part of the input dimension. Synchronized with dimensions[0].partition.
- part_target_frequencies : list
Supervised univariate only: List of frequencies per target value for each part of the input dimension. Synchronized with dimensions[0].partition.
- cell_ids : list of str
Multivariate only: Unique identifiers of the grid’s cells.
- cell_part_indexes : list
Multivariate only: List of dimension indexes defining each cell. Synchronized with cell_ids.
- cell_frequencies : list of int
Unsupervised multivariate only: Frequencies for each cell. Synchronized with cell_ids.
- cell_target_frequencies : list
Supervised multivariate only: List of frequencies per target value for each cell. Synchronized with cell_ids.
- write_report(writer)
Writes the instance’s TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
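To make the supervised univariate case concrete, the sketch below turns a list shaped like part_target_frequencies into per-part conditional target distributions, which is one way to read the density estimate a data grid represents. The function name and the sample frequencies are hypothetical, and this is not the library's own computation.

```python
def target_distribution_per_part(part_target_frequencies):
    """Normalizes each part's target frequencies into probabilities.

    The input has one inner list per part of the input dimension, each
    holding one frequency per target value.
    """
    distributions = []
    for frequencies in part_target_frequencies:
        total = sum(frequencies)
        if total == 0:
            # An empty part carries no information about the target
            distributions.append([0.0] * len(frequencies))
        else:
            distributions.append([f / total for f in frequencies])
    return distributions


# Hypothetical grid: two parts of the input variable, two target values
part_target_frequencies = [[30, 10], [5, 15]]
distributions = target_distribution_per_part(part_target_frequencies)
print(distributions)
```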
- class khiops.core.analysis_results.DataGridDimension(json_data=None)
Bases: object
A dimension (variable) of a data grid
- Parameters:
- json_data : dict, optional
JSON data of an element at the dimensions field of a dataGrid field of an element of the list found at the variablesDetailedStatistics field within the preparationReport field of a Khiops JSON report file. If not specified it returns an empty instance.
- Attributes:
- variable : str
Variable name.
- type : “Numerical” or “Categorical”
Variable type.
- partition_type : “Intervals”, “Values” or “Value groups”
Partition type.
- partition : list
The dimension parts. The list objects are of type:
PartInterval: if partition_type is “Intervals”
PartValue: if partition_type is “Values”
PartValueGroup: if partition_type is “Value groups”
- write_report(writer)
Writes the instance’s TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.analysis_results.EvaluationReport(json_data=None)
Bases: object
Evaluation report for predictors
- Parameters:
- json_data : dict, optional
JSON data of the fields:
trainEvaluationReport: predictor training
testEvaluationReport: predictor training & non-empty test split
evaluationReport: explicit evaluation
The first two fields are set when doing a supervised analysis: either with the “Train Model” feature of the Khiops app or the train_predictor function of the Khiops Python core API. The third field is set when doing an explicit evaluation: either with the Evaluate Predictor feature of the Khiops app or the evaluate_predictor function of the Khiops Python core API.
If not specified it returns an empty instance.
- Attributes:
- report_type : “Evaluation” (only possible value)
Report type.
- evaluation_type : “Train”, “Test” or “”
Evaluation type. The value “” is set when the evaluation was explicit.
- dictionary : str
Name of the training data table dictionary.
- database : str
Path of the main training data table file.
- sample_percentage : int
Percentage of instances used in training.
- sampling_mode : str
Sampling mode used to split the train and test datasets.
- selection_variable : str
Variable used to select instances for training.
- selection_value : str
Value of selection_variable to select instances for training.
- instance_number : int
Number of training instances.
- learning_task : “Classification analysis” or “Regression analysis”
Type of learning task.
- target_variable : str
Name of the target variable.
- main_target_value : str
Main value of the target variable.
- predictors_performance : list of PredictorPerformance
Performance metrics for each predictor.
- regression_rec_curves : list of PredictorCurve
REC curves for each regressor.
- classification_target_values : list of str
Target variable values for which a classifier lift curve was evaluated.
- classification_lift_curves : list of PredictorCurve
Lift curves for each target value in classification_target_values. The lift curve for the optimal predictor is prepended to those of the target values.
- get_classifier_lift_curve(classifier_name, target_value)
Returns the lift curve for the specified classifier and target value
- Parameters:
- classifier_name : str
A name of a classifier.
- target_value : str
A specific value of the target variable.
- Returns:
- PredictorCurve
The lift curve for the specified classifier and target value.
- Raises:
- KeyError
If no classifier with the specified name exists or no target value with the specified name exists.
- get_predictor_names()
Returns the names of the available predictors in the report
- Returns:
- list of str
The names of the available predictors.
- get_predictor_performance(predictor_name)
Returns the performance metrics for the specified predictor
- Parameters:
- predictor_name : str
A predictor name.
- Returns:
- PredictorPerformance
The performance metrics for the specified predictor.
- Raises:
- KeyError
If no predictor with the specified name exists.
- get_regressor_rec_curve(regressor_name)
Returns the REC curve for the specified regressor
- Parameters:
- regressor_name : str
Name of a regressor.
- Returns:
- PredictorCurve
The REC curve for the specified regressor.
- Raises:
- ValueError
If no regressor curves are available.
- KeyError
If no regressor with the specified name exists.
- get_snb_lift_curve(target_value)
Returns the lift curve of the Selective Naive Bayes classifier for a given target value
- Parameters:
- target_value : str
A specific value of the target variable.
- Returns:
- PredictorCurve
The lift curve of the Selective Naive Bayes classifier for the specified target value.
- Raises:
- ValueError
If the Selective Naive Bayes classifier information is not available.
- KeyError
If no target value with the specified name exists.
- get_snb_performance()
Returns the performance metrics for the Selective Naive Bayes predictor
- Returns:
- PredictorPerformance
The performance metrics for the Selective Naive Bayes predictor.
- Raises:
- ValueError
If the Selective Naive Bayes information is not available in the report.
- get_snb_rec_curve()
Returns the REC curve for the Selective Naive Bayes regressor
- Returns:
- PredictorCurve
The REC curve for the Selective Naive Bayes regressor.
- Raises:
- ValueError
If the Selective Naive Bayes information is not available in the report.
- write_report(writer)
Writes the instance’s TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer object.
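The accessors above combine naturally. As a minimal sketch, the helper below ranks classifiers by AUC using only get_predictor_names, get_predictor_performance and the auc attribute of PredictorPerformance. `StubEvaluationReport` is a hypothetical stand-in that mimics the documented method contracts so the example runs without a report file, and the predictor name "Baseline" is invented for illustration.

```python
from types import SimpleNamespace


def rank_classifiers_by_auc(report):
    """Returns the predictors' performance objects, best AUC first."""
    performances = [
        report.get_predictor_performance(name)
        for name in report.get_predictor_names()
    ]
    return sorted(performances, key=lambda perf: perf.auc, reverse=True)


class StubEvaluationReport:
    """Hypothetical stand-in exposing the two documented accessors."""

    def __init__(self, performances):
        self._by_name = {perf.name: perf for perf in performances}

    def get_predictor_names(self):
        return list(self._by_name)

    def get_predictor_performance(self, predictor_name):
        # A plain dict lookup raises KeyError for unknown names, as documented
        return self._by_name[predictor_name]


stub = StubEvaluationReport(
    [
        SimpleNamespace(name="Baseline", auc=0.71),
        SimpleNamespace(name="Selective Naive Bayes", auc=0.92),
    ]
)
ranked_names = [perf.name for perf in rank_classifiers_by_auc(stub)]
print(ranked_names)
```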
- class khiops.core.analysis_results.ModelingReport(json_data=None)
Bases: object
Modeling report of all predictors created in a supervised analysis
- Parameters:
- json_data : dict, optional
JSON data of the modelingReport field of a Khiops JSON report file. If not specified it returns an empty instance.
- Attributes:
- report_type : “Modeling” (only possible value)
Report type.
- dictionary : str
Name of the training data table dictionary.
- database : str
Path of the main training data table file.
- sample_percentage : int
Percentage of instances used in training.
- sampling_mode : “Include sample” or “Exclude sample”
Sampling mode used to split the train and test datasets.
- selection_variable : str
Variable used to select instances for training.
- selection_value : str
Value of selection_variable to select instances for training.
- learning_task : “Classification analysis” or “Regression analysis”
Name of the associated learning task.
- target_variable : str
Name of the target variable.
- main_target_value : str
Main value of the target variable.
- trained_predictors : list of TrainedPredictor
The predictors trained in the task.
- get_predictor(predictor_name)
Returns the specified predictor
- Parameters:
- predictor_name : str
Name of the predictor.
- Returns:
- TrainedPredictor
The predictor object for the specified name.
- Raises:
- KeyError
If there is no predictor with the specified name.
- get_predictor_names()
Returns the names of the available predictor reports
- Returns:
- list of str
The names of the available predictor reports.
- get_snb_predictor()
Returns the Selective Naive Bayes predictor
- Returns:
- TrainedPredictor
The predictor object for “Selective Naive Bayes”.
- Raises:
- KeyError
If there is no predictor named “Selective Naive Bayes”.
- write_report(writer)
Writes the instance’s TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.analysis_results.PartInterval(json_data=None)
Bases: object
Element of a numerical interval partition in a data grid
- Parameters:
- json_data : list, optional
JSON data of the partition field of a dataGrid field of an element of the list found at the variablesDetailedStatistics field within the preparationReport field of a Khiops JSON report file. If not specified it returns an empty instance.
- Attributes:
- lower_bound : float
The lower bound of the interval.
- upper_bound : float
The upper bound of the interval.
- is_missing : bool
True if it is the missing values part (bounds are None).
- is_left_open : bool
True if the interval has no minimum. lower_bound still contains the minimum value seen on data.
- is_right_open : bool
True if the interval has no maximum. upper_bound still contains the maximum value seen on data.
- part_type()
Type of this part
- Returns:
- str
Only possible value: “Interval”.
- write_report_line(writer)
Writes a line of the TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
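The flags above interact: a missing part has no bounds, and an open side should not be labeled with the stored bound even though one is present. The sketch below builds a display label from the five documented attributes. The label format is an arbitrary choice made for this example (it is not the format used by the TSV reports), and the `SimpleNamespace` objects are hypothetical stand-ins for PartInterval instances.

```python
from types import SimpleNamespace


def interval_label(part):
    """Builds a display label from the documented PartInterval attributes."""
    if part.is_missing:
        # Bounds are None for the missing values part
        return "Missing"
    # On an open side the stored bound is only the extreme value seen on
    # data, so the label shows an infinite bound instead
    lower = "-inf" if part.is_left_open else str(part.lower_bound)
    upper = "+inf" if part.is_right_open else str(part.upper_bound)
    return f"{lower} .. {upper}"


# Hypothetical stand-ins for two parts of a numerical partition
first = SimpleNamespace(
    lower_bound=17.0,
    upper_bound=25.5,
    is_missing=False,
    is_left_open=True,
    is_right_open=False,
)
missing = SimpleNamespace(
    lower_bound=None,
    upper_bound=None,
    is_missing=True,
    is_left_open=False,
    is_right_open=False,
)
labels = [interval_label(first), interval_label(missing)]
print(labels)
```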
- class khiops.core.analysis_results.PartValue(json_data=None)
Bases: object
Element of a value partition (singletons) in a data grid
- Parameters:
- json_data : str, optional
The value contained in this singleton part. If not specified it returns an empty object.
- Attributes:
- value : str
A representation of the value defining the singleton.
- part_type()
Type of the instance
- Returns:
- str
Only possible value: “Value”.
- write_report_line(writer)
Writes a line of the TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.analysis_results.PartValueGroup(json_data=None)
Bases: object
Element of a categorical partition in a data grid
- Parameters:
- json_data : list of str, optional
The list of values of the group. If not specified it returns an empty instance.
- Attributes:
- values : list of str
The group’s values.
- is_default_part : bool
True if this part is dedicated to all unknown values.
- part_type()
Type of the instance
- Returns:
- str
Only possible value: “Value group”.
- write_report_line(writer)
Writes a line of the TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.analysis_results.PredictorCurve(json_data=None)
Bases: object
A lift curve for a classifier or a REC curve for a regressor
- Parameters:
- json_data : dict, optional
JSON data of an element of the liftCurves or recCurves field of one of the evaluation report fields of a Khiops JSON report file. If not specified it returns an empty instance.
- Attributes:
- type : “Lift” (classifier) or “REC” (regressor)
Type of predictor curve.
- name : str
Name of evaluated predictor.
- values : list of float
The curve’s y-axis values.
- class khiops.core.analysis_results.PredictorPerformance(json_data=None)
Bases: object
A predictor’s performance evaluation
This class describes the performance of a predictor (classifier or regressor).
- Parameters:
- json_data : dict, optional
JSON data of an element of the dictionary found at the predictorPerformances field within one of the evaluation report fields of a Khiops JSON report file. If not specified it returns an empty instance.
Note
The confusion_matrix field is considered a “detail” and is not initialized in the constructor. Instead, it is initialized explicitly via the init_details method. This allows partial initialization of large reports.
- Attributes:
- rank : str
A string index representing the order in the report.
- type : “Classifier” or “Regressor”
Type of the predictor.
- name : str
Human readable name.
- data_grid : DataGrid
Data grid representing the distribution of the target values per part of the descriptive variable in the evaluated dataset.
- accuracy : float
Classifier only: Accuracy.
- compression : float
Classifier only: Compression rate.
- auc : float
Classifier only: Area under the ROC curve.
- confusion_matrix : ConfusionMatrix
Classifier only: Confusion matrix.
- rmse : float
Regressor only: Root mean square error.
- mae : float
Regressor only: Mean absolute error.
- nlpd : float
Regressor only: Negative log predictive density.
- rank_rmse : float
Regressor only: Root mean square error on the target’s value rank.
- rank_mae : float
Regressor only: Mean absolute error on the target’s value rank.
- rank_nlpd : float
Regressor only: Negative log predictive density on the target’s value rank.
- get_metric(metric_name)
Returns the value of the specified metric
Note
The names of the available metrics are returned by the method get_metric_names.
- Parameters:
- metric_name : str
A metric name (case insensitive).
- Returns:
- float
The value of the specified metric.
- get_metric_names()
Returns the available univariate metrics
- Returns:
- list of str
The names of the available metrics.
- init_details(json_data=None)
Initializes the details’ attributes from a Python JSON object
- is_detailed()
Returns True if the report contains any detailed information
- Returns:
- bool
True if the report contains any detailed information.
- write_report_details(writer)
Writes the details of the TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- write_report_header_line(writer)
Writes the header line of a TSV report into a writer object
The header is the same for all variable types.
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- write_report_line(writer)
Writes a line of the TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
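Since get_metric_names and get_metric are meant to be used together, a small helper can collect every available metric into a dictionary. The sketch below relies only on those two documented methods. `StubPerformance` is a hypothetical stand-in whose case-insensitive lookup imitates the documented behavior, not the library's implementation.

```python
def metrics_as_dict(performance):
    """Collects all available metrics as a name -> value dictionary."""
    return {
        name: performance.get_metric(name)
        for name in performance.get_metric_names()
    }


class StubPerformance:
    """Hypothetical stand-in exposing the two documented metric accessors."""

    def __init__(self, **metrics):
        # Store names in lower case so that lookups are case insensitive
        self._metrics = {name.lower(): value for name, value in metrics.items()}

    def get_metric_names(self):
        return list(self._metrics)

    def get_metric(self, metric_name):
        return self._metrics[metric_name.lower()]


stub = StubPerformance(accuracy=0.85, compression=0.4, auc=0.9)
metrics = metrics_as_dict(stub)
print(metrics)
```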
- class khiops.core.analysis_results.PreparationReport(json_data=None)
Bases: object
Univariate data preparation report: discretizations and groupings
The attributes related to the target variable and null model are available only in the case of a supervised learning task (classification or regression).
- Parameters:
- json_data : dict, optional
JSON data of the preparationReport field of a Khiops JSON report file. If not specified it returns an empty instance.
- Attributes:
- report_type : “Preparation” (only possible value)
Report type.
- dictionary : str
Name of the training data table dictionary.
- variable_types : list of str
The different types of variables.
- variable_numbers : list of int
Number of variables for each type. Synchronized with variable_types.
- database : str
Path of the main training data table file.
- sample_percentage : int
Percentage of instances used in training.
- sampling_mode : str
Sampling mode used to split the train and test datasets.
- selection_variable : str
Name of the variable used to select training instances.
- selection_value : str
Value of selection_variable to select training instances.
- instance_number : int
Number of training instances.
- learning_task : str
Name of the associated learning task. Possible values:
“Classification analysis”
“Regression analysis”
“Unsupervised analysis”
- target_variable : str
Target variable name.
- main_target_value : str
Main value of a categorical target variable.
- target_stats_min : float
Minimum of a numerical target variable.
- target_stats_max : float
Maximum of a numerical target variable.
- target_stats_mean : float
Mean of a numerical target variable.
- target_stats_std_dev : float
Standard deviation of a numerical target variable.
- target_stats_missing_number : int
Number of missing values for a numerical target variable.
- target_stats_mode : str
Mode of a categorical target variable.
- target_stats_mode_frequency : int
Mode frequency of a categorical target variable.
- target_values : list of str
Values of a categorical target variable.
- target_value_frequencies : list of int
Frequencies for each target value. Synchronized with target_values.
- evaluated_variable_number : int
Number of variables analyzed.
- informative_variable_number : int
Supervised analysis only: Number of informative variables.
- max_constructed_variables : int
Maximum number of constructed variables specified for the analysis.
- max_trees : int
Maximum number of constructed trees specified for the analysis.
- max_pairs : int
Maximum number of constructed variable pairs specified for the analysis.
- discretization : str
Type of discretization method used.
- value_grouping : str
Type of grouping method used.
- null_model_construction_cost : float
Coding length of the null construction model.
- null_model_preparation_cost : float
Coding length of the null preparation model.
- null_model_data_cost : float
Coding length of the data given the null model.
- variables_statistics : list of VariableStatistics
Variable statistics for each variable analyzed.
- get_variable_names()
Returns the names of the variables analyzed during the preparation
- Returns:
- list of str
The names of the variables analyzed during the preparation.
- get_variable_statistics(variable_name)
Returns the statistics of the specified variable
- Parameters:
- variable_name : str
Name of the variable.
- Returns:
- VariableStatistics
The statistics of the specified variable.
- Raises:
- KeyError
If no variable with the specified name exists.
- write_report(writer)
Writes the instance’s TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
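Several attributes of this class are described as synchronized lists, meaning that position i in one list corresponds to position i in the other. A minimal sketch of how such a pair can be combined follows, using target_values and target_value_frequencies as the example. The helper name and the sample data are hypothetical.

```python
def target_frequency_table(target_values, target_value_frequencies):
    """Pairs each target value with its frequency (synchronized lists)."""
    if len(target_values) != len(target_value_frequencies):
        raise ValueError("synchronized lists must have the same length")
    return dict(zip(target_values, target_value_frequencies))


# Hypothetical categorical target with two values
table = target_frequency_table(["less", "more"], [760, 240])

# The most frequent value and its count can be compared with the
# target_stats_mode and target_stats_mode_frequency attributes
mode = max(table, key=table.get)
mode_frequency = table[mode]
print(mode, mode_frequency)
```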
- class khiops.core.analysis_results.SelectedVariable(json_data=None)
Bases: object
Information about a selected variable in a predictor
- Parameters:
- json_data : dict, optional
JSON data representing an element of the selectedVariables list in the trainedPredictorsDetails field within the modelingReport field of a Khiops JSON report file. If not specified it returns an empty instance.
- Attributes:
- name : str
Human readable variable name.
- prepared_name : str
Internal variable name.
- level : float
Variable level.
- weight : float
Variable weight in the model.
- importance : float
A measure of overall importance of the variable in the model. It is the geometric mean of the level and weight.
- map : bool
True if the variable is in the MAP model. Deprecated: Will be removed in Khiops Python 11.
- write_report_header_line(writer)
Writes the header line of a TSV report into a writer object
The header is the same for all variable types.
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- write_report_line(writer)
Writes a line of the TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
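The importance attribute is described as the geometric mean of level and weight, that is, the square root of their product. The sketch below spells out that formula. The function name and the sample values are illustrative only, and the value stored in a report is computed by Khiops, not by this code.

```python
import math


def geometric_mean_importance(level, weight):
    """Geometric mean of two non-negative numbers: sqrt(level * weight)."""
    if level < 0 or weight < 0:
        raise ValueError("level and weight must be non-negative")
    return math.sqrt(level * weight)


# Hypothetical variable with level 0.16 and weight 0.25
importance = geometric_mean_importance(0.16, 0.25)
print(importance)
```

A consequence of using a geometric mean is that a variable with either a zero level or a zero weight has zero importance, whatever the other value is.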
- class khiops.core.analysis_results.TrainedPredictor(json_data=None)
Bases: object
Trained predictor information
- Parameters:
- json_data : dict, optional
JSON data of an element of the list found at the trainedPredictors field within the modelingReport field of a Khiops JSON report file. If not specified it returns an empty instance.
Note
The selected_variables field is considered a “detail” and is not initialized in the constructor. Instead, it is initialized explicitly via the init_details method. This allows partial initialization of large reports.
- Attributes:
- type : str
Predictor type. Valid values are found in the predictor_types class attribute. They are:
“Selective Naive Bayes”
“MAP Naive Bayes” (deprecated)
“Naive Bayes”
“Univariate”
- family : “Classifier” or “Regressor”
Predictor family name. Valid values are found in the predictor_families class variable.
- name : str
Human readable predictor name.
- variable_number : int
Number of variables used by the predictor.
- selected_variables : list of SelectedVariable
Variables used by the predictor. Only for types “Selective Naive Bayes” and “MAP Naive Bayes”.
- init_details(json_data=None)
Initializes the details’ attributes from a Python JSON object
- Parameters:
- json_data : dict, optional
JSON data of the dictionary found at the trainedPredictorsDetails field within the modelingReport field of a Khiops JSON report file. If not specified it leaves the object as-is.
- is_detailed()
Returns True if the report contains any detailed information
- Returns:
- bool
True if the report contains any detailed information.
- write_report_details(writer)
Writes the details of the TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- write_report_header_line(writer)
Writes the header line of a TSV report into a writer object
The header is the same for all variable types.
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- write_report_line(writer)
Writes a line of the TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.analysis_results.VariablePairStatistics(json_data=None)
Bases: object
Variable pair information and statistics
- Parameters:
- json_data : dict, optional
JSON data of an element of the list found at the variablesPairStatistics field within the bivariatePreparationReport field of a Khiops JSON report file. If not specified it returns an empty instance.
Note
The data_grid field is considered a “detail” and is not initialized in the constructor. Instead, it is initialized explicitly via the init_details method. This allows partial initialization of large reports.
- Attributes:
- rankstr
Variable rank with respect to its level. Lower Rank = Higher Level.
- name1str
Name of the pair’s first variable.
- name2str
Name of the pair’s second variable.
- levelfloat
Predictive importance of the pair.
- level1float
Predictive importance of the first variable.
- level2float
Predictive importance of the second variable.
- delta_levelfloat
Difference between the pair’s level and the sum of those of its components (
delta_level = level - level1 - level2
).
- variable_numberint
- Number of active variables in the pair:
0 means that there is no information in any of the variables
1 means that the pair information reduces to that of any of its components
2 means that the two variables are jointly informative
- part_number1int
Number of parts of the first variable partition.
- part_number2int
Number of parts of the second variable partition.
- cell_numberint
Number of cells in the pair's data grid.
- construction_costfloat
Advanced: Construction cost of the variable. More complex variables cost more.
- preparation_costfloat
Advanced: Partition model cost. More complex partitions cost more.
- data_costfloat
Advanced: Negative log-likelihood of the variable given a preparation model and a construction model.
- data_grid
DataGrid
A density estimation of the partitioned pair of variables with respect to the target.
- init_details(json_data=None)¶
Initializes the details’ attributes from a Python JSON object
- Parameters:
- json_datadict, optional
JSON data of an element of the list found at the
variablesPairsDetailedStatistics
field within the bivariatePreparationReport
field of a Khiops JSON report file. If not specified it leaves the object as-is.
- is_detailed()¶
Returns True if the report contains any detailed information
- Returns:
- bool
True if the report contains any detailed information.
- write_report_details(writer)¶
Writes the details’ attributes into a writer object
- Parameters:
- writer
KhiopsOutputWriter
Output writer.
- write_report_header_line(writer)¶
Writes the header line of a TSV report into a writer object
The header is the same for all variable types.
- Parameters:
- writer
KhiopsOutputWriter
Output writer.
- write_report_line(writer)¶
Writes a line of the TSV report into a writer object
- Parameters:
- writer
KhiopsOutputWriter
Output writer.
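The relation between the level attributes and the meaning of variable_number can be illustrated with a minimal sketch. It does not import khiops: the helper summarize_pair and the sample values are hypothetical, and a types.SimpleNamespace object stands in for a VariablePairStatistics instance, exposing only the attributes documented above.

```python
from types import SimpleNamespace

# Meaning of variable_number, paraphrased from the attribute description.
ACTIVE_VARIABLE_MEANING = {
    0: "no information in either variable",
    1: "pair information reduces to that of a single component",
    2: "the two variables are jointly informative",
}


def summarize_pair(pair):
    """Summarize a VariablePairStatistics-like object (hypothetical helper)."""
    # Recompute delta_level from its documented definition.
    delta = pair.level - pair.level1 - pair.level2
    return {
        "pair": (pair.name1, pair.name2),
        "delta_level": delta,
        "meaning": ACTIVE_VARIABLE_MEANING.get(pair.variable_number, "unknown"),
    }


# Illustrative values only; they are not taken from a real Khiops report.
example = SimpleNamespace(
    name1="age",
    name2="education",
    level=0.25,
    level1=0.125,
    level2=0.0625,
    variable_number=2,
)
summary = summarize_pair(example)
```

Because the helper reads only documented attributes, it should also accept real instances, for example the elements of the variable_pair_statistics list of a BivariatePreparationReport.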
- class khiops.core.analysis_results.VariableStatistics(json_data=None)¶
Bases:
object
Variable information and statistics
Note
The statistics in this class are for both numerical and categorical data.
- Parameters:
- json_datadict, optional
JSON data of an element of the list found at the
variablesStatistics
field within the preparationReport
field of a Khiops JSON report file. If not specified it returns an empty instance.
Note
The
data_grid
field is considered a “detail” and is not initialized in the constructor. Instead, it is initialized explicitly via the init_details
method. This allows partial initialization of large reports.
- Attributes:
- rankstr
Variable rank with respect to its level. Lower Rank = Higher Level.
- namestr
Variable name.
- typestr
- Variable type. Valid values:
“Numerical”
“Categorical”
“Date”
“Time”
“Timestamp”
“Table”
“Entity”
“Structure”
- levelfloat
Variable predictive importance.
- target_part_numberint
In regression: Number of the target intervals
In classification with target grouping: Number of target groups
- part_numberint
Number of parts of the variable partition.
- value_numberint
Number of distinct values of the variable.
- minfloat
Minimum value of the variable.
- maxfloat
Maximum value of the variable.
- meanfloat
Mean value of the variable.
- std_devfloat
Standard deviation of the variable.
- missing_numberint
Number of missing values of the variable.
- modefloat
Most common value.
- mode_frequencyint
Frequency of the most common value.
- input_valueslist of str
Different values taken by the variable. If there are too many values, only the most frequent ones are available.
- input_value_frequencieslist of int
The frequencies for each input value. Synchronized with
input_values
.
- construction_costfloat
Construction cost of the variable. More complex variables cost more.
- preparation_costfloat
Partition model cost. More complex partitions cost more.
- data_costfloat
Negative log-likelihood of the variable given a preparation model and a construction model.
- derivation_rulestr
If the variable is not native, the Khiops dictionary function used to derive it. Otherwise it is set to
None
.
- data_grid
DataGrid
A density estimation of the partitioned variable with respect to the target.
- init_details(json_data=None)¶
Initializes the details’ attributes from a Python JSON object
- Parameters:
- json_datadict, optional
JSON data of an element of the list found at the
variablesDetailedStatistics
field within the preparationReport
field of a Khiops JSON report file. If not specified it leaves the object as-is.
- is_detailed()¶
Returns True if the report contains any detailed information
- Returns:
- bool
True if the report contains any detailed information.
- write_report_details(writer)¶
Writes the details’ attributes into a writer object
- Parameters:
- writer
KhiopsOutputWriter
Output writer.
- write_report_header_line(writer)¶
Writes the header line of a TSV report into a writer object
The header is the same for all variable types.
- Parameters:
- writer
KhiopsOutputWriter
Output writer.
- write_report_line(writer)¶
Writes a line of the TSV report into a writer object
- Parameters:
- writer
KhiopsOutputWriter
Output writer.
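Since input_value_frequencies is synchronized with input_values, the two lists can be zipped into a mapping. The sketch below assumes nothing beyond those two documented attributes: value_frequencies is a hypothetical helper, the sample data is invented, and a types.SimpleNamespace object stands in for a VariableStatistics instance so that the snippet runs without khiops installed.

```python
from types import SimpleNamespace


def value_frequencies(variable):
    """Map each listed input value to its frequency (hypothetical helper)."""
    # Treat missing lists as empty; this sketch assumes they may be None
    # when the variable has no listed values.
    values = variable.input_values or []
    frequencies = variable.input_value_frequencies or []
    return dict(zip(values, frequencies))


# Illustrative values only; they are not taken from a real Khiops report.
example = SimpleNamespace(
    name="color",
    input_values=["red", "blue", "green"],
    input_value_frequencies=[50, 30, 20],
)
frequencies = value_frequencies(example)
```

Note that when a variable has too many values only the most frequent ones are listed, so the frequencies in the mapping need not add up to the total number of instances.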
- khiops.core.analysis_results.read_analysis_results_file(json_file_path)¶
Reads a Khiops JSON report
- Parameters:
- json_file_pathstr
Path of the JSON report file.
- Returns:
AnalysisResults
An instance of AnalysisResults containing the report’s information.
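As a minimal sketch of how the returned object might be traversed, the code below follows the composition preparation_report -> variable_statistics described in the class overview. The helpers variable_levels and load_variable_levels are hypothetical names, and the sketch assumes that preparation_report is None when the report has no preparation section. The khiops import is kept local to load_variable_levels so that the rest of the snippet runs on stand-in objects.

```python
from types import SimpleNamespace


def variable_levels(results):
    """Return (name, level) pairs from an AnalysisResults-like object."""
    # Assumption: preparation_report is None when the section is absent.
    report = getattr(results, "preparation_report", None)
    if report is None:
        return []
    return [(v.name, v.level) for v in report.variable_statistics]


def load_variable_levels(json_file_path):
    """Read a Khiops JSON report and return its (name, level) pairs.

    Requires the khiops package to be installed.
    """
    from khiops.core.analysis_results import read_analysis_results_file

    return variable_levels(read_analysis_results_file(json_file_path))


# Stand-in object with invented values, used here instead of a real report.
fake_results = SimpleNamespace(
    preparation_report=SimpleNamespace(
        variable_statistics=[
            SimpleNamespace(name="age", level=0.12),
            SimpleNamespace(name="color", level=0.0),
        ]
    )
)
levels = variable_levels(fake_results)
```

With khiops installed, load_variable_levels would be called with the path of an existing JSON report file.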
Examples
- See the following functions of the
samples.py
documentation script: