core.analysis_results¶
Submodule of khiops.core
Classes to access Khiops JSON reports
Class Overview¶
The diagrams below describe the relationships between the classes in this module. They are mostly compositions (has-a relations), and we omit native attributes (str, int, float, etc.).
The main class of this module is AnalysisResults; it is largely a
composition of sub-report objects with the following structure:
AnalysisResults
|- preparation_report -> PreparationReport
|- bivariate_preparation_report -> BivariatePreparationReport
|- modeling_report -> ModelingReport
|- train_evaluation_report |
|- test_evaluation_report |-> EvaluationReport
|- evaluation_report |
These sub-report classes in turn use further classes to represent specific pieces of
information in each report. The dependencies for the classes PreparationReport and
BivariatePreparationReport are:
PreparationReport
|- variable_statistics -> list of VariableStatistics
BivariatePreparationReport
|- variable_pair_statistics -> list of VariablePairStatistics
VariableStatistics
|- data_grid -> DataGrid
VariablePairStatistics
|- data_grid -> DataGrid
DataGrid
|- dimensions -> list of DataGridDimension
DataGridDimension
|- partition -> list of PartInterval OR
| list of PartValue OR
| list of PartValueGroup
for class ModelingReport:
ModelingReport
|- trained_predictors -> list of TrainedPredictor
TrainedPredictor
|- selected_variables -> list of SelectedVariable
and for class EvaluationReport:
EvaluationReport
|- predictors_performance -> list of PredictorPerformance
|- classification_lift_curves -> list of PredictorCurve (classification only)
|- regression_rec_curves -> list of PredictorCurve (regression only)
PredictorPerformance
|- confusion_matrix -> ConfusionMatrix (classification only)
For a complete illustration of how to access the information in all the classes of this
module, see their write_report methods, which write TSV (tab-separated values)
reports.
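As a rough illustration of the composition described above, the sketch below walks from a results object down to the data grid dimensions. It only relies on names that appear on this page (preparation_report, get_variable_names, get_variable_statistics, data_grid, dimensions, variable, partition_type, partition). The helper describe_partitions and the stand-in objects are hypothetical and not part of the library.

```python
from types import SimpleNamespace


def describe_partitions(results):
    """List (variable, partition type, number of parts) for every data grid dimension."""
    rows = []
    preparation_report = results.preparation_report
    if preparation_report is None:  # sub-reports may be missing from the JSON data
        return rows
    for name in preparation_report.get_variable_names():
        statistics = preparation_report.get_variable_statistics(name)
        data_grid = statistics.data_grid
        if data_grid is None:  # assumption: the data grid may not be loaded
            continue
        for dimension in data_grid.dimensions:
            rows.append(
                (dimension.variable, dimension.partition_type, len(dimension.partition))
            )
    return rows


# Hypothetical stand-ins that only mimic the attribute names of the diagrams
age_grid = SimpleNamespace(
    dimensions=[
        SimpleNamespace(variable="age", partition_type="Intervals", partition=[0, 1, 2])
    ]
)
statistics_by_name = {"age": SimpleNamespace(data_grid=age_grid)}
demo_results = SimpleNamespace(
    preparation_report=SimpleNamespace(
        get_variable_names=lambda: list(statistics_by_name),
        get_variable_statistics=lambda name: statistics_by_name[name],
    )
)
print(describe_partitions(demo_results))
```

With a real report, the same function could presumably be called on the object returned by read_analysis_results_file.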
Functions¶
read_analysis_results_file | Reads a Khiops JSON report
Classes¶
AnalysisResults | Main class containing the information of a Khiops JSON file
BivariatePreparationReport | Bivariate data preparation report: 2D grid models
ConfusionMatrix | A classifier's confusion matrix
DataGrid | A piecewise constant probability density estimation
DataGridDimension | A dimension (variable) of a data grid
EvaluationReport | Evaluation report for predictors
ModelingReport | Modeling report of all predictors created in a supervised analysis
PartInterval | Element of a numerical interval partition in a data grid
PartValue | Element of a value partition (singletons) in a data grid
PartValueGroup | Element of a categorical partition in a data grid
PredictorCurve | A lift curve for a classifier or a REC curve for a regressor
PredictorPerformance | A predictor's performance evaluation
PreparationReport | Univariate data preparation report: discretizations and groupings
SelectedVariable | Information about a selected variable in a predictor
TrainedPredictor | Trained predictor information
VariablePairStatistics | Variable pair information and statistics
VariableStatistics | Variable information and statistics
- class khiops.core.analysis_results.AnalysisResults(json_data=None)¶
Bases: KhiopsJSONObject
Main class containing the information of a Khiops JSON file
Sub-reports not available in the JSON data are optional (set to None).
- Parameters:
- json_data : dict, optional
A dictionary representing the data of a Khiops JSON report file. If not specified, it returns an empty instance.
Note
See also the read_analysis_results_file function from the core API to obtain an instance of this class from a Khiops JSON file.
- Attributes:
- tool : str
Name of the Khiops tool that generated the report.
- version : str
Version of the Khiops tool that generated the report.
- short_description : str
Short description defined by the user.
- logs : list of tuples
2-tuples linking each sub-task name to a list containing the warnings and errors found during the execution of that sub-task. Available only if there were errors or warnings.
- preparation_report : PreparationReport
A report about the variables’ discretizations and groupings.
- bivariate_preparation_report : BivariatePreparationReport, optional
A report of the grid models created from pairs of variables. Available only when pairs of variables were created in the analysis.
- modeling_report : ModelingReport
A report describing the predictor models. Available only in supervised analysis.
- train_evaluation_report : EvaluationReport
An evaluation report of the trained models on the train dataset split. Available only in supervised analysis.
- test_evaluation_report : EvaluationReport
An evaluation report of the trained models on the test dataset split. Available only in supervised analysis and when the test split was not empty.
- evaluation_report : EvaluationReport
An EvaluationReport instance for evaluations created with an explicit evaluation (either with the evaluate_predictor core API function or the Evaluate Predictor feature of the Khiops desktop app). Available only when the report was generated with the aforementioned features.
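Because unavailable sub-reports are set to None, code that consumes a results object usually needs a presence check. The sketch below keeps the attribute name next to each available sub-report. The helper available_sub_reports and the stand-in object are hypothetical, and only the attribute names come from the list above.

```python
from types import SimpleNamespace

# Attribute names taken from the attribute list of AnalysisResults
SUB_REPORT_ATTRIBUTES = (
    "preparation_report",
    "bivariate_preparation_report",
    "modeling_report",
    "train_evaluation_report",
    "test_evaluation_report",
    "evaluation_report",
)


def available_sub_reports(results):
    """Map each sub-report attribute name to its object, skipping those set to None."""
    found = {}
    for name in SUB_REPORT_ATTRIBUTES:
        report = getattr(results, name, None)
        if report is not None:
            found[name] = report
    return found


# Hypothetical stand-in for an unsupervised analysis: only the preparation report exists
unsupervised_results = SimpleNamespace(
    preparation_report=object(),
    bivariate_preparation_report=None,
    modeling_report=None,
    train_evaluation_report=None,
    test_evaluation_report=None,
    evaluation_report=None,
)
print(list(available_sub_reports(unsupervised_results)))
```

When only the objects are needed, the get_reports method documented below already returns the available sub-reports as a list.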
- get_reports()¶
Returns all available sub-reports
- Returns:
- list
All available sub-reports.
- write_report(stream_or_writer)¶
Writes the instance’s TSV report into a writer object
- Parameters:
- stream_or_writer : io.IOBase or KhiopsOutputWriter
Output stream or writer.
- write_report_file(report_file_path)¶
Writes a TSV report file with the object’s information
- Parameters:
- report_file_path : str
Path of the output TSV report file.
- class khiops.core.analysis_results.BivariatePreparationReport(json_data=None)¶
Bases: object
Bivariate data preparation report: 2D grid models
The attributes related to the target variable and null model are available only in the case of a supervised learning task (only classification in the bivariate case).
- Parameters:
- json_data : dict, optional
JSON data of the bivariatePreparationReport field of a Khiops JSON report file. If not specified, it returns an empty instance.
- Attributes:
- report_type : “BivariatePreparation” (only possible value)
Report type.
- dictionary : str
Name of the training data table dictionary.
- variable_types : list of str
The different types of variables.
- variable_numbers : list of int
The number of variables for each type in variable_types (synchronized lists).
- database : str
Path of the main training data table file.
- sample_percentage : int
Percentage of instances used in training.
- sampling_mode : str
Sampling mode used to split the train and test datasets.
- selection_variable : str
Variable used to select instances for training.
- selection_value : str
Value of selection_variable to select instances for training.
- instance_number : int
Number of training instances.
- learning_task : str
Name of the associated learning task. Possible values:
“Classification analysis”
“Regression analysis”
“Unsupervised analysis”
- target_variable : str
Target variable name in supervised analysis.
- main_target_value : str
Main modality of the target variable in the supervised case.
- target_stats_mode : str
Mode of a categorical target variable.
- target_stats_mode_frequency : int
Mode frequency of a categorical target variable.
- target_values : list of str
Values of a categorical target variable.
- target_value_frequencies : list of int
Frequencies for each value in target_values (synchronized lists).
- evaluated_pair_number : int
Number of variable pairs evaluated.
- informative_pair_number : int
Number of informative variable pairs. A pair is considered informative if its level is greater than the sum of its components’ levels.
- variable_pair_statistics : list of VariablePairStatistics
Statistics for each analyzed pair of variables.
- get_variable_pair_names()¶
Returns the pairs of variable names available in this report
- Returns:
- list of tuple
The pairs of variable names available in this report.
- get_variable_pair_statistics(variable_name_1, variable_name_2)¶
Returns the statistics of the specified pair of variables
Note
The variable names can be given in any order.
- Parameters:
- variable_name_1 : str
Name of the first variable.
- variable_name_2 : str
Name of the second variable.
- Returns:
- VariablePairStatistics
The statistics of the specified pair of variables.
- Raises:
- KeyError
If no pair with the specified names exists.
- write_report(writer)¶
Writes the instance’s TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.analysis_results.ConfusionMatrix(json_data=None)¶
Bases: object
A classifier’s confusion matrix
- Parameters:
- json_data : dict, optional
JSON data of the confusionMatrix field of an element of the dictionary found at the predictorsDetailedPerformances field within one of the evaluation report fields of a Khiops JSON report file. If not specified, it returns an empty object.
- Attributes:
- values : list of str
Values of the target variable.
- matrix : list
Matrix of predicted frequencies vs target frequencies. This list is synchronized with values. Each list element represents a row of the confusion matrix, that is, the target frequencies for a fixed predicted target value.
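To make the row/column convention concrete, here is a minimal sketch that derives the accuracy and the per-value recall from a values list and a matrix laid out as described above (one row per predicted value, one column per actual target value). The function confusion_matrix_summary and the sample numbers are illustrative assumptions, not part of the library.

```python
def confusion_matrix_summary(values, matrix):
    """Return (accuracy, recall per target value) for a row=predicted matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(values)))
    accuracy = correct / total if total else 0.0
    recall = {}
    for j, value in enumerate(values):
        # Column j gathers the instances whose actual target value is `value`
        actual_count = sum(row[j] for row in matrix)
        recall[value] = matrix[j][j] / actual_count if actual_count else 0.0
    return accuracy, recall


# Made-up frequencies: row 0 = predicted "no", row 1 = predicted "yes"
sample_values = ["no", "yes"]
sample_matrix = [[50, 10], [5, 35]]
accuracy, recall = confusion_matrix_summary(sample_values, sample_matrix)
print(accuracy, recall)
```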
- write_report(writer)¶
Writes the instance’s TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.analysis_results.DataGrid(json_data=None)¶
Bases: object
A piecewise constant probability density estimation
A data grid represents one or many variables referred to as “dimensions” to differentiate them from the original data variables. Each dimension can be partitioned by:
Intervals for numerical variables
Values (singletons) / Value groups for categorical variables
The Cartesian product of the unidimensional partitions provides a multivariate partition of cells whose frequencies make it possible to estimate the multivariate probability density.
In the univariate case, the data grid is simply a histogram. In the case of multiple variables, the data grid may be supervised or not. If supervised, the target variable is the last one, and the data grid represents the conditional density estimator of the source variable with respect to the target. Otherwise, it represents a joint density estimator.
In the case of an unsupervised data grid, the cells are described by their index on the variable partitions, together with their frequencies. For a supervised data grid, the cells are described by their index on the input variables’ partitions, and a vector of target frequencies is associated with each cell.
- Parameters:
- json_data : dict, optional
JSON data at a dataGrid field of an element of the list found at the variablesDetailedStatistics field within the preparationReport field of a Khiops JSON report file. If not specified, it returns an empty instance.
- Attributes:
- is_supervised : bool
True if the data grid is supervised (there is a target).
- dimensions : list of DataGridDimension
The dimensions of the data grid.
- frequencies : list of int
Unsupervised only: Frequencies for each part.
- part_interests : list of float
Supervised univariate only: Prediction interests for each part of the input dimension. Synchronized with dimensions[0].partition.
- part_target_frequencies : list
Supervised univariate only: List of frequencies per target value for each part of the input dimension. Synchronized with dimensions[0].partition.
- cell_ids : list of str
Multivariate only: Unique identifiers of the grid’s cells.
- cell_part_indexes : list
Multivariate only: List of dimension indexes defining each cell. Synchronized with cell_ids.
- cell_frequencies : list of int
Unsupervised multivariate only: Frequencies for each cell. Synchronized with cell_ids.
- cell_target_frequencies : list
Supervised multivariate only: List of frequencies per target value for each cell. Synchronized with cell_ids.
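The density estimation idea can be sketched for the unsupervised multivariate case: dividing each cell frequency by the total gives an estimate of the joint probability of the cell, and summing over one dimension gives a marginal. The functions below work on plain lists shaped like cell_part_indexes and cell_frequencies. Their names and the sample values are assumptions made for illustration, and cells absent from the lists are treated as empty.

```python
def cell_probabilities(cell_part_indexes, cell_frequencies):
    """Estimate the joint probability of each cell from its frequency."""
    total = sum(cell_frequencies)
    if total == 0:
        return {}
    return {
        tuple(indexes): frequency / total
        for indexes, frequency in zip(cell_part_indexes, cell_frequencies)
    }


def marginal_probabilities(probabilities, dimension_index):
    """Sum the cell probabilities over all dimensions except one."""
    marginal = {}
    for indexes, probability in probabilities.items():
        part = indexes[dimension_index]
        marginal[part] = marginal.get(part, 0.0) + probability
    return marginal


# Made-up 2 x 2 grid: each cell is identified by one part index per dimension
sample_indexes = [[0, 0], [0, 1], [1, 0], [1, 1]]
sample_frequencies = [10, 30, 40, 20]
joint = cell_probabilities(sample_indexes, sample_frequencies)
first_dimension = marginal_probabilities(joint, 0)
print(joint, first_dimension)
```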
- write_report(writer)¶
Writes the instance’s TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.analysis_results.DataGridDimension(json_data=None)¶
Bases: object
A dimension (variable) of a data grid
- Parameters:
- json_data : dict, optional
JSON data of an element at the dimensions field of a dataGrid field of an element of the list found at the variablesDetailedStatistics field within the preparationReport field of a Khiops JSON report file. If not specified, it returns an empty instance.
- Attributes:
- variable : str
Variable name.
- type : “Numerical” or “Categorical”
Variable type.
- partition_type : “Intervals”, “Values” or “Value groups”
Partition type.
- partition : list
The dimension parts. The list objects are of type:
PartInterval: If partition_type is “Intervals”
PartValue: If partition_type is “Values”
PartValueGroup: If partition_type is “Value groups”
- write_report(writer)¶
Writes the instance’s TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.analysis_results.EvaluationReport(json_data=None)¶
Bases: object
Evaluation report for predictors
- Parameters:
- json_data : dict, optional
JSON data of one of the fields:
trainEvaluationReport: predictor training
testEvaluationReport: predictor training & non-empty test split
evaluationReport: explicit evaluation
The first two fields are set when doing a supervised analysis: either with the “Train Model” feature of the Khiops app or the train_predictor function of the Khiops Python core API. The third field is set when doing an explicit evaluation: either with the Evaluate Predictor feature of the Khiops app or the evaluate_predictor function of the Khiops Python core API.
If not specified, it returns an empty instance.
- Attributes:
- report_type : “Evaluation” (only possible value)
Report type.
- evaluation_type : “Train”, “Test” or “”
Evaluation type. The value “” is set when the evaluation was explicit.
- dictionary : str
Name of the training data table dictionary.
- database : str
Path of the main training data table file.
- sample_percentage : int
Percentage of instances used in training.
- sampling_mode : str
Sampling mode used to split the train and test datasets.
- selection_variable : str
Variable used to select instances for training.
- selection_value : str
Value of selection_variable to select instances for training.
- instance_number : int
Number of training instances.
- learning_task : “Classification analysis” or “Regression analysis”
Type of learning task.
- target_variable : str
Name of the target variable.
- main_target_value : str
Main value of the target variable.
- predictors_performance : list of PredictorPerformance
Performance metrics for each predictor.
- regression_rec_curves : list of PredictorCurve
REC curves for each regressor.
- classification_target_values : list of str
Target variable values for which a classifier lift curve was evaluated.
- classification_lift_curves : list of PredictorCurve
Lift curves for each target value in classification_target_values. The lift curve for the optimal predictor is prepended to those of the target values.
- get_classifier_lift_curve(classifier_name, target_value)¶
Returns the lift curve for the specified classifier and target value
- Parameters:
- classifier_name : str
A name of a classifier.
- target_value : str
A specific value of the target variable.
- Returns:
- PredictorCurve
The lift curve for the specified classifier and target value.
- Raises:
- KeyError
If no classifier with the specified name exists or no target value with the specified name exists.
- get_predictor_names()¶
Returns the names of the available predictors in the report
- Returns:
- list of str
The names of the available predictors.
- get_predictor_performance(predictor_name)¶
Returns the performance metrics for the specified predictor
- Parameters:
- predictor_name : str
A predictor name.
- Returns:
- PredictorPerformance
The performance metrics for the specified predictor.
- Raises:
- KeyError
If no predictor with the specified name exists.
- get_regressor_rec_curve(regressor_name)¶
Returns the REC curve for the specified regressor
- Parameters:
- regressor_name : str
Name of a regressor.
- Returns:
- PredictorCurve
The REC curve for the specified regressor.
- Raises:
- ValueError
If no regressor curves are available.
- KeyError
If no regressor with the specified name exists.
- get_snb_lift_curve(target_value)¶
Returns the lift curve of the Selective Naive Bayes classifier for a given target value
- Parameters:
- target_value : str
A specific value of the target variable.
- Returns:
- PredictorCurve
The lift curve of the Selective Naive Bayes classifier for the specified target value.
- Raises:
- ValueError
If the Selective Naive Bayes classifier information is not available.
- KeyError
If no target value with the specified name exists.
- get_snb_performance()¶
Returns the performance metrics for the Selective Naive Bayes predictor
- Returns:
- PredictorPerformance
The performance metrics for the Selective Naive Bayes predictor.
- Raises:
- ValueError
If the Selective Naive Bayes information is not available in the report.
- get_snb_rec_curve()¶
Returns the REC curve for the Selective Naive Bayes regressor
- Returns:
- PredictorCurve
The REC curve for the Selective Naive Bayes regressor.
- Raises:
- ValueError
If the Selective Naive Bayes information is not available in the report.
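Since the Selective Naive Bayes accessors raise ValueError when that predictor is absent, calling code may want a fallback. The sketch below shows one possible pattern using only the method names documented above. The helper snb_or_first_performance and the StandInReport class are hypothetical and exist only to make the example self-contained.

```python
def snb_or_first_performance(evaluation_report):
    """Prefer the Selective Naive Bayes metrics, else fall back to the first predictor."""
    try:
        return evaluation_report.get_snb_performance()
    except ValueError:
        # Documented above: raised when the SNB information is not in the report
        names = evaluation_report.get_predictor_names()
        if not names:
            return None
        return evaluation_report.get_predictor_performance(names[0])


class StandInReport:
    """Hypothetical stand-in mimicking the documented methods; not the Khiops class."""

    def __init__(self, performances, has_snb):
        self._performances = performances
        self._has_snb = has_snb

    def get_snb_performance(self):
        if not self._has_snb:
            raise ValueError("Selective Naive Bayes information not available")
        return self._performances["Selective Naive Bayes"]

    def get_predictor_names(self):
        return list(self._performances)

    def get_predictor_performance(self, predictor_name):
        return self._performances[predictor_name]


with_snb = StandInReport({"Selective Naive Bayes": "snb metrics"}, has_snb=True)
without_snb = StandInReport({"Univariate age": "univariate metrics"}, has_snb=False)
print(snb_or_first_performance(with_snb), snb_or_first_performance(without_snb))
```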
- write_report(writer)¶
Writes the instance’s TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer object.
- class khiops.core.analysis_results.ModelingReport(json_data=None)¶
Bases: object
Modeling report of all predictors created in a supervised analysis
- Parameters:
- json_data : dict, optional
JSON data of the modelingReport field of a Khiops JSON report file. If not specified, it returns an empty instance.
- Attributes:
- report_type : “Modeling” (only possible value)
Report type.
- dictionary : str
Name of the training data table dictionary.
- database : str
Path of the main training data table file.
- sample_percentage : int
Percentage of instances used in training.
- sampling_mode : “Include sample” or “Exclude sample”
Sampling mode used to split the train and test datasets.
- selection_variable : str
Variable used to select instances for training.
- selection_value : str
Value of selection_variable to select instances for training.
- learning_task : “Classification analysis” or “Regression analysis”
Name of the associated learning task.
- target_variable : str
Name of the target variable.
- main_target_value : str
Main value of the target variable.
- trained_predictors : list of TrainedPredictor
The predictors trained in the task.
- get_predictor(predictor_name)¶
Returns the specified predictor
- Parameters:
- predictor_name : str
Name of the predictor.
- Returns:
- TrainedPredictor
The predictor object for the specified name.
- Raises:
- KeyError
If there is no predictor with the specified name.
- get_predictor_names()¶
Returns the names of the available predictor reports
- Returns:
- list of str
The names of the available predictor reports.
- get_snb_predictor()¶
Returns the Selective Naive Bayes predictor
- Returns:
- TrainedPredictor
The predictor object for “Selective Naive Bayes”.
- Raises:
- KeyError
If there is no predictor named “Selective Naive Bayes”.
- write_report(writer)¶
Writes the instance’s TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.analysis_results.PartInterval(json_data=None)¶
Bases: object
Element of a numerical interval partition in a data grid
- Parameters:
- json_data : list, optional
JSON data of the partition field of a dataGrid field of an element of the list found at the variablesDetailedStatistics field within the preparationReport field of a Khiops JSON report file. If not specified, it returns an empty instance.
- Attributes:
- lower_bound : float
The lower bound of the interval.
- upper_bound : float
The upper bound of the interval.
- is_missing : bool
True if it is the missing values part (bounds are None).
- is_left_open : bool
True if the interval has no minimum. lower_bound still contains the minimum value seen on data.
- is_right_open : bool
True if the interval has no maximum. upper_bound still contains the maximum value seen on data.
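The three flags combine with the bounds as follows: a missing part has no bounds, while an open side should be read as unbounded even though the bound attribute still holds an observed value. The sketch below turns these attributes into a display label. The function interval_label, the "]a, b]" notation and the stand-in objects are assumptions chosen for illustration and do not reproduce the library's own output.

```python
from types import SimpleNamespace


def interval_label(part):
    """Build a readable label from the documented PartInterval attributes."""
    if part.is_missing:
        return "Missing"
    # An open side is shown as infinite, whatever value the bound attribute stores
    lower = "-inf" if part.is_left_open else str(part.lower_bound)
    upper = "+inf" if part.is_right_open else str(part.upper_bound)
    return f"]{lower}, {upper}]"


# Hypothetical stand-ins carrying only the attributes listed above
first_part = SimpleNamespace(
    lower_bound=1.5, upper_bound=3.0,
    is_missing=False, is_left_open=True, is_right_open=False,
)
missing_part = SimpleNamespace(
    lower_bound=None, upper_bound=None,
    is_missing=True, is_left_open=False, is_right_open=False,
)
print(interval_label(first_part), interval_label(missing_part))
```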
- part_type()¶
Type of this part
- Returns:
- str
Only possible value: “Interval”.
- write_report_line(writer)¶
Writes a line of the TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.analysis_results.PartValue(json_data=None)¶
Bases: object
Element of a value partition (singletons) in a data grid
- Parameters:
- json_data : str, optional
The value contained in this singleton part. If not specified, it returns an empty object.
- Attributes:
- value : str
A representation of the value defining the singleton.
- part_type()¶
Type of the instance
- Returns:
- str
Only possible value: “Value”.
- write_report_line(writer)¶
Writes a line of the TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.analysis_results.PartValueGroup(json_data=None)¶
Bases: object
Element of a categorical partition in a data grid
- Parameters:
- json_data : list of str, optional
The list of values of the group. If not specified, it returns an empty instance.
- Attributes:
- values : list of str
The group’s values.
- is_default_part : bool
True if this part is dedicated to all unknown values.
- part_type()¶
Type of the instance
- Returns:
- str
Only possible value: “Value group”.
- write_report_line(writer)¶
Writes a line of the TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.analysis_results.PredictorCurve(json_data=None)¶
Bases: object
A lift curve for a classifier or a REC curve for a regressor
- Parameters:
- json_data : dict, optional
JSON data of an element of the liftCurves or recCurves field of one of the evaluation report fields of a Khiops JSON report file. If not specified, it returns an empty instance.
- Attributes:
- type : “Lift” (classifier) or “REC” (regressor)
Type of predictor curve.
- name : str
Name of evaluated predictor.
- values : list of float
The curve’s y-axis values.
- class khiops.core.analysis_results.PredictorPerformance(json_data=None)¶
Bases: object
A predictor’s performance evaluation
This class describes the performance of a predictor (classifier or regressor).
- Parameters:
- json_data : dict, optional
JSON data of an element of the dictionary found at the predictorPerformances field within one of the evaluation report fields of a Khiops JSON report file. If not specified, it returns an empty instance.
Note
The confusion_matrix field is considered a “detail” and is not initialized in the constructor. Instead, it is initialized explicitly via the init_details method. This allows partial initialization of large reports.
- Attributes:
- rank : str
A string index representing the order in the report.
- type : “Classifier” or “Regressor”
Type of the predictor.
- name : str
Human readable name.
- data_grid : DataGrid
Data grid representing the distribution of the target values per part of the descriptive variable in the evaluated dataset.
- accuracy : float
Classifier only: Accuracy.
- compression : float
Classifier only: Compression rate.
- auc : float
Classifier only: Area under the ROC curve.
- confusion_matrix : ConfusionMatrix
Classifier only: Confusion matrix.
- rmse : float
Regressor only: Root mean square error.
- mae : float
Regressor only: Mean absolute error.
- nlpd : float
Regressor only: Negative log predictive density.
- rank_rmse : float
Regressor only: Root mean square error on the target’s value rank.
- rank_mae : float
Regressor only: Mean absolute error on the target’s value rank.
- rank_nlpd : float
Regressor only: Negative log predictive density on the target’s value rank.
- get_metric(metric_name)¶
Returns the value of the specified metric
Note
The names of the available metrics can be obtained with the get_metric_names method.
- Parameters:
- metric_name : str
A metric name (case insensitive).
- Returns:
- float
The value of the specified metric.
- get_metric_names()¶
Returns the available univariate metrics
- Returns:
- list of str
The names of the available metrics.
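The two methods above combine naturally to collect every available metric at once. The sketch below does so for any object exposing get_metric_names and get_metric. The helper metrics_as_dict and the StandInPerformance class, including its metric names and values, are hypothetical and only mimic the documented behaviour (case-insensitive lookup).

```python
def metrics_as_dict(performance):
    """Collect every available metric of a predictor into a plain dictionary."""
    return {
        name: performance.get_metric(name)
        for name in performance.get_metric_names()
    }


class StandInPerformance:
    """Hypothetical stand-in mimicking the documented methods; not the Khiops class."""

    def __init__(self, metrics):
        # Keys are stored in lower case so that lookups are case insensitive
        self._metrics = {name.lower(): value for name, value in metrics.items()}

    def get_metric_names(self):
        return list(self._metrics)

    def get_metric(self, metric_name):
        return self._metrics[metric_name.lower()]


classifier_performance = StandInPerformance({"accuracy": 0.85, "auc": 0.91})
collected = metrics_as_dict(classifier_performance)
print(collected)
```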
- init_details(json_data=None)¶
Initializes the details’ attributes from a python JSON object
- is_detailed()¶
Returns True if the report contains any detailed information
- Returns:
- bool
True if the report contains any detailed information.
- write_report_details(writer)¶
Writes the details of the TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- write_report_header_line(writer)¶
Writes the header line of a TSV report into a writer object
The header is the same for all variable types.
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- write_report_line(writer)¶
Writes a line of the TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.analysis_results.PreparationReport(json_data=None)¶
Bases: object
Univariate data preparation report: discretizations and groupings
The attributes related to the target variable and null model are available only in the case of a supervised learning task (classification or regression).
- Parameters:
- json_data : dict, optional
JSON data of the preparationReport field of a Khiops JSON report file. If not specified, it returns an empty instance.
- Attributes:
- report_type : “Preparation” (only possible value)
Report type.
- dictionary : str
Name of the training data table dictionary.
- variable_types : list of str
The different types of variables.
- variable_numbers : list of int
Number of variables for each type. Synchronized with variable_types.
- database : str
Path of the main training data table file.
- sample_percentage : int
Percentage of instances used in training.
- sampling_mode : str
Sampling mode used to split the train and test datasets.
- selection_variable : str
Name of the variable used to select training instances.
- selection_value : str
Value of selection_variable to select training instances.
- instance_number : int
Number of training instances.
- learning_task : str
Name of the associated learning task. Possible values:
“Classification analysis”
“Regression analysis”
“Unsupervised analysis”
- target_variable : str
Target variable name.
- main_target_value : str
Main value of a categorical target variable.
- target_stats_min : float
Minimum of a numerical target variable.
- target_stats_max : float
Maximum of a numerical target variable.
- target_stats_mean : float
Mean of a numerical target variable.
- target_stats_std_dev : float
Standard deviation of a numerical target variable.
- target_stats_missing_number : int
Number of missing values for a numerical target variable.
- target_stats_mode : str
Mode of a categorical target variable.
- target_stats_mode_frequency : int
Mode frequency of a categorical target variable.
- target_values : list of str
Values of a categorical target variable.
- target_value_frequencies : list of int
Frequencies for each target value. Synchronized with target_values.
- evaluated_variable_number : int
Number of variables analyzed.
- informative_variable_number : int
Supervised analysis only: Number of informative variables.
- max_constructed_variables : int
Maximum number of constructed variables specified for the analysis.
- max_trees : int
Maximum number of constructed trees specified for the analysis.
- max_pairs : int
Maximum number of constructed variable pairs specified for the analysis.
- discretization : str
Type of discretization method used.
- value_grouping : str
Type of grouping method used.
- null_model_construction_cost : float
Coding length of the null construction model.
- null_model_preparation_cost : float
Coding length of the null preparation model.
- null_model_data_cost : float
Coding length of the data given the null model.
- variables_statistics : list of VariableStatistics
Variable statistics for each variable analyzed.
- get_variable_names()¶
Returns the names of the variables analyzed during the preparation
- Returns:
- list of str
The names of the variables analyzed during the preparation.
- get_variable_statistics(variable_name)¶
Returns the statistics of the specified variable
- Parameters:
- variable_name : str
Name of the variable.
- Returns:
- VariableStatistics
The statistics of the specified variable.
- Raises:
- KeyError
If no variable with the specified name exists.
- write_report(writer)¶
Writes the instance’s TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.analysis_results.SelectedVariable(json_data=None)¶
Bases: object
Information about a selected variable in a predictor
- Parameters:
- json_data : dict, optional
JSON data representing an element of the selectedVariables list in the trainedPredictorsDetails field within the modelingReport field of a Khiops JSON report file. If not specified, it returns an empty instance.
- Attributes:
- name : str
Human readable variable name.
- prepared_name : str
Internal variable name.
- level : float
Variable level.
- weight : float
Variable weight in the model.
- importance : float
A measure of overall importance of the variable in the model. It is the geometric mean of the level and weight.
- map : bool
True if the variable is in the MAP model. Deprecated: Will be removed in Khiops Python 11.
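The importance attribute is described as the geometric mean of level and weight, that is, the square root of their product. A minimal sketch of that formula follows. The function name and the sample numbers are illustrative, and the library reads the value from the report rather than necessarily computing it this way.

```python
import math


def geometric_mean_importance(level, weight):
    """Geometric mean of two non-negative numbers: sqrt(level * weight)."""
    return math.sqrt(level * weight)


# Made-up values: a variable with level 0.04 and weight 0.25
sample_importance = geometric_mean_importance(0.04, 0.25)
print(sample_importance)
```

A consequence of the geometric mean is that a variable with either a zero level or a zero weight has zero importance.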
- write_report_header_line(writer)¶
Writes the header line of a TSV report into a writer object
The header is the same for all variable types.
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- write_report_line(writer)¶
Writes a line of the TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.analysis_results.TrainedPredictor(json_data=None)¶
Bases: object
Trained predictor information
- Parameters:
- json_data : dict, optional
JSON data of an element of the list found at the trainedPredictors field within the modelingReport field of a Khiops JSON report file. If not specified, it returns an empty instance.
Note
The selected_variables field is considered a “detail” and is not initialized in the constructor. Instead, it is initialized explicitly via the init_details method. This allows partial initialization of large reports.
- Attributes:
- type : str
Predictor type. Valid values are found in the predictor_types class attribute. They are:
“Selective Naive Bayes”
“MAP Naive Bayes” (deprecated)
“Naive Bayes”
“Univariate”
- family : “Classifier” or “Regressor”
Predictor family name. Valid values are found in the predictor_families class variable.
- name : str
Human readable predictor name.
- variable_number : int
Number of variables used by the predictor.
- selected_variables : list of SelectedVariable
Variables used by the predictor. Only for types “Selective Naive Bayes” and “MAP Naive Bayes”.
- init_details(json_data=None)¶
Initializes the details’ attributes from a Python JSON object
- Parameters:
- json_data : dict, optional
JSON data of the dictionary found at the trainedPredictorsDetails field within the modelingReport field of a Khiops JSON report file. If not specified, it leaves the object as-is.
- is_detailed()¶
Returns True if the report contains any detailed information
- Returns:
- bool
True if the report contains any detailed information.
- write_report_details(writer)¶
Writes the details of the TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- write_report_header_line(writer)¶
Writes the header line of a TSV report into a writer object
The header is the same for all variable types.
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- write_report_line(writer)¶
Writes a line of the TSV report into a writer object
- Parameters:
- writer : KhiopsOutputWriter
Output writer.
- class khiops.core.analysis_results.VariablePairStatistics(json_data=None)¶
Bases: object
Variable pair information and statistics
- Parameters:
- json_data : dict, optional
JSON data of an element of the list found at the variablesPairStatistics field within the bivariatePreparationReport field of a Khiops JSON report file. If not specified, it returns an empty instance.
Note
The data_grid field is considered a “detail” and is not initialized in the constructor. Instead, it is initialized explicitly via the init_details method. This allows partial initialization of large reports.
- Attributes:
- rank : str
Variable rank with respect to its level. Lower Rank = Higher Level.
- name1 : str
Name of the pair’s first variable.
- name2 : str
Name of the pair’s second variable.
- level : float
Predictive importance of the pair.
- level1 : float
Predictive importance of the first variable.
- level2 : float
Predictive importance of the second variable.
- delta_level : float
Difference between the pair’s level and the sum of those of its components (delta_level = level - level1 - level2).
- variable_number : int
Number of active variables in the pair:
0 means that there is no information in any of the variables
1 means that the pair information reduces to that of one of its components
2 means that the two variables are jointly informative
- part_number1 : int
Number of parts of the first variable partition.
- part_number2 : int
Number of parts of the second variable partition.
- cell_number : int
Number of cells in the pair’s grid.
- construction_cost : float
Advanced: Construction cost of the variable. More complex variables cost more.
- preparation_cost : float
Advanced: Partition model cost. More complex partitions cost more.
- data_cost : float
Advanced: Negative log-likelihood of the variable given a preparation model and a construction model.
- data_grid : DataGrid
A density estimation of the partitioned pair of variables with respect to the target.
- init_details(json_data=None)¶
Initializes the details’ attributes from a Python JSON object
- Parameters:
- json_data dict, optional
JSON data of an element of the list found at the variablesPairsDetailedStatistics field within the bivariatePreparationReport field of a Khiops JSON report file. If not specified it leaves the object as-is.
- is_detailed()¶
Returns True if the report contains any detailed information
- Returns:
- bool
True if the report contains any detailed information.
- write_report_details(writer)¶
Writes the details’ attributes into a writer object
- Parameters:
- writer KhiopsOutputWriter
Output writer.
- write_report_header_line(writer)¶
Writes the header line of a TSV report into a writer object
The header is the same for all variable types.
- Parameters:
- writer KhiopsOutputWriter
Output writer.
- write_report_line(writer)¶
Writes a line of the TSV report into a writer object
- Parameters:
- writer KhiopsOutputWriter
Output writer.
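To make the relation between the level attributes concrete, here is a small sketch that recomputes delta_level from the documented formula and maps variable_number to its documented meaning. `describe_pair` is a hypothetical helper, and the numbers are made up for illustration. A plain stand-in object is used instead of a real VariablePairStatistics instance so that the snippet runs without a Khiops report.

```python
from types import SimpleNamespace


def describe_pair(pair):
    """Return (delta_level, meaning) for an object with the pair attributes."""
    # Formula given in the delta_level attribute description.
    delta_level = pair.level - pair.level1 - pair.level2
    # Meanings given in the variable_number attribute description.
    meanings = {
        0: "no information in any of the variables",
        1: "pair information reduces to that of one component",
        2: "the two variables are jointly informative",
    }
    return delta_level, meanings[pair.variable_number]


# Stand-in with made-up values; it only mimics the documented attribute names.
example = SimpleNamespace(level=0.5, level1=0.25, level2=0.125, variable_number=2)
delta, meaning = describe_pair(example)
print(delta, meaning)
```

A positive delta_level means the pair is more informative than the sum of its components taken separately.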
- class khiops.core.analysis_results.VariableStatistics(json_data=None)¶
Bases: object
Variable information and statistics
Note
The statistics in this class are for both numerical and categorical data.
- Parameters:
- json_data dict, optional
JSON data of an element of the list found at the variablesStatistics field within the preparationReport field of a Khiops JSON report file. If not specified it returns an empty instance.
Note
The data_grid field is considered a “detail” and is not initialized in the constructor. Instead, it is initialized explicitly via the init_details method. This allows partial initialization of large reports.
- Attributes:
- rank str
Variable rank with respect to its level. Lower Rank = Higher Level.
- name str
Variable name.
- type str
Variable type. Valid values:
“Numerical”
“Categorical”
“Date”
“Time”
“Timestamp”
“Table”
“Entity”
“Structure”
- level float
Variable predictive importance.
- target_part_number int
In regression: Number of target intervals
In classification with target grouping: Number of target groups
- part_number int
Number of parts of the variable partition.
- value_number int
Number of distinct values of the variable.
- min float
Minimum value of the variable.
- max float
Maximum value of the variable.
- mean float
Mean value of the variable.
- std_dev float
Standard deviation of the variable.
- missing_number int
Number of missing values of the variable.
- mode float
Most common value.
- mode_frequency int
Frequency of the most common value.
- input_values list of str
Different values taken by the variable. If there are too many values, only the most frequent ones are available.
- input_value_frequencies list of int
The frequencies for each input value. Synchronized with input_values.
- construction_cost float
Construction cost of the variable. More complex variables cost more.
- preparation_cost float
Partition model cost. More complex partitions cost more.
- data_cost float
Negative log-likelihood of the variable given a preparation model and a construction model.
- derivation_rule str
If the variable is not native, the Khiops dictionary function used to derive it. Otherwise it is set to None.
- data_grid DataGrid
A density estimation of the partitioned variable with respect to the target.
- init_details(json_data=None)¶
Initializes the details’ attributes from a Python JSON object
- Parameters:
- json_data dict, optional
JSON data of an element of the list found at the variablesDetailedStatistics field within the preparationReport field of a Khiops JSON report file. If not specified it leaves the object as-is.
- is_detailed()¶
Returns True if the report contains any detailed information
- Returns:
- bool
True if the report contains any detailed information.
- write_report_details(writer)¶
Writes the details’ attributes into a writer object
- Parameters:
- writer KhiopsOutputWriter
Output writer.
- write_report_header_line(writer)¶
Writes the header line of a TSV report into a writer object
The header is the same for all variable types.
- Parameters:
- writer KhiopsOutputWriter
Output writer.
- write_report_line(writer)¶
Writes a line of the TSV report into a writer object
- Parameters:
- writer KhiopsOutputWriter
Output writer.
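A common use of the name and level attributes is to list the variables that carry predictive information, best first. The sketch below shows one way to do this. `most_informative` is a hypothetical helper, the stand-in objects and their values are made up, and treating a level of None as "no level available" is an assumption rather than documented behavior.

```python
from types import SimpleNamespace


def most_informative(variable_statistics, min_level=0.0):
    """Return (name, level) tuples with level above min_level, best first."""
    kept = [
        (stats.name, stats.level)
        for stats in variable_statistics
        # Assumption: level may be None when no level was computed.
        if stats.level is not None and stats.level > min_level
    ]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Stand-ins that only mimic the documented attribute names.
sample = [
    SimpleNamespace(name="age", level=0.25),
    SimpleNamespace(name="id", level=0.0),
    SimpleNamespace(name="income", level=0.5),
    SimpleNamespace(name="comment", level=None),
]
print(most_informative(sample))
```

The same function can be applied to any list of objects exposing name and level, such as the variable_statistics list of a PreparationReport.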
- khiops.core.analysis_results.read_analysis_results_file(json_file_path)¶
Reads a Khiops JSON report
- Parameters:
- json_file_path str
Path of the JSON report file.
- Returns:
- AnalysisResults
An instance of AnalysisResults containing the report’s information.
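As a rough illustration of how the returned object might be explored, the sketch below counts the variables and variable pairs of a report. `count_report_items` and `summarize_report_file` are hypothetical helpers. The attribute names come from the class overview of this module, while the possibility that a sub-report is None when absent from the file is an assumption.

```python
def count_report_items(results):
    """Count variables and variable pairs in an AnalysisResults-like object."""
    # Assumption: a sub-report missing from the JSON file is None.
    preparation = getattr(results, "preparation_report", None)
    bivariate = getattr(results, "bivariate_preparation_report", None)
    variables = 0 if preparation is None else len(preparation.variable_statistics)
    pairs = 0 if bivariate is None else len(bivariate.variable_pair_statistics)
    return {"variables": variables, "pairs": pairs}


def summarize_report_file(json_file_path):
    """Read a Khiops JSON report and count its items (requires Khiops)."""
    # Imported lazily so the sketch can be loaded without Khiops installed.
    from khiops.core.analysis_results import read_analysis_results_file

    return count_report_items(read_analysis_results_file(json_file_path))
```

Keeping the counting logic separate from the file reading makes it easy to try on stand-in objects before pointing it at a real report.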
Examples
- See the following functions of the samples.py documentation script: