core.internals.io

Submodule of khiops.core.internals

Classes to handle Khiops-specific I/O

Functions

encode_file_path

Encodes a file path

flexible_json_load

Flexibly loads a JSON file

Classes

KhiopsJSONObject

Represents the contents of a Khiops JSON file

KhiopsOutputWriter

Output writer with additional services to handle Khiops special encodings

class khiops.core.internals.io.KhiopsJSONObject(json_data=None)

Bases: object

Represents the contents of a Khiops JSON file

Parameters:
json_data : dict, optional

Python dictionary representing the data of a Khiops JSON file. If None, an empty object is created.

Attributes:
tool : str

Name of the Khiops tool that generated the file.

version : str

Version of the Khiops tool that generated the file.

khiops_encoding : str, optional

Custom encoding used by Khiops in the file. Valid values:
  • None: for backwards compatibility
  • “ascii”: ASCII encoding
  • “ansi”: ANSI encoding
  • “utf8”: UTF-8 encoding
  • “mixed_ansi_utf8”: Mixed characters from UTF-8 and ANSI, without collisions
  • “colliding_ansi_utf8”: Colliding characters from UTF-8 and ANSI

sub_tool : str, optional

Identifies the tool that originated the JSON file. Used by tools of the Khiops family such as PataText or Enneade.

Raises:
KhiopsJSONError

If the JSON data is invalid.

create_output_file_writer(stream)

Creates an output file with the proper encoding settings

Parameters:
stream : io.IOBase

An output stream object.

Returns:
KhiopsOutputWriter

An output file object.

write_khiops_json_file(json_file_path)

Writes the JSON data of the object to a Khiops JSON file

Parameters:
json_file_path : str

Path to the Khiops JSON file.
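The valid khiops_encoding values listed for KhiopsJSONObject can be checked before a dictionary is handed to the class. The following is a minimal sketch, not the library's implementation: the helper name read_header_sketch is hypothetical, the JSON keys "tool", "version" and "khiops_encoding" are assumed to match the attribute names, and ValueError stands in for KhiopsJSONError.

```python
# Valid values as listed in the KhiopsJSONObject documentation
VALID_KHIOPS_ENCODINGS = (
    None,
    "ascii",
    "ansi",
    "utf8",
    "mixed_ansi_utf8",
    "colliding_ansi_utf8",
)


def read_header_sketch(json_data):
    """Hypothetical helper: extracts and checks the header fields of a dict.

    Assumption: the JSON keys are named like the documented attributes.
    """
    if not isinstance(json_data, dict):
        raise TypeError("json_data must be a dict")

    # A missing key yields None, which is accepted for backwards compatibility
    khiops_encoding = json_data.get("khiops_encoding")
    if khiops_encoding not in VALID_KHIOPS_ENCODINGS:
        # ValueError is a stand-in for KhiopsJSONError in this sketch
        raise ValueError(f"Invalid khiops_encoding: {khiops_encoding!r}")

    return {
        "tool": json_data.get("tool"),
        "version": json_data.get("version"),
        "khiops_encoding": khiops_encoding,
    }
```

The sample field values used to exercise this helper are illustrative only and do not come from a real Khiops report.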

class khiops.core.internals.io.KhiopsOutputWriter(stream, force_ansi=False, ansi_unicode_chars=None)

Bases: object

Output writer with additional services to handle Khiops special encodings

Parameters:
stream : io.IOBase

A writable output stream. Special text transformations in buffers inheriting from io.TextIOBase are ignored.

force_ansi : bool, default False

If True, characters in the ANSI 128-256 range that were recoded to UTF-8 are transformed back to ANSI in all written output.

ansi_unicode_chars : list of str, optional

A list of UTF-8 characters with equivalents in the ANSI 128-256 range, which are encoded back to ANSI when force_ansi is True. By default, all UTF-8 equivalents of the ANSI 128-256 range are encoded back.
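The interplay of force_ansi and ansi_unicode_chars can be illustrated with a small stand-in class. This is a sketch under stated assumptions, not the KhiopsOutputWriter implementation: the class name AnsiAwareWriterSketch is hypothetical, "ANSI" is assumed to mean Windows-1252 (cp1252), and the stream is assumed to be a binary stream.

```python
import io


class AnsiAwareWriterSketch:
    """Hypothetical stand-in that mimics the documented parameters."""

    def __init__(self, stream, force_ansi=False, ansi_unicode_chars=None):
        self.stream = stream  # assumption: a writable binary stream
        self.force_ansi = force_ansi
        if ansi_unicode_chars is None:
            # Default: every character that cp1252 defines in the upper byte range
            ansi_unicode_chars = []
            for byte_value in range(128, 256):
                try:
                    ansi_unicode_chars.append(bytes([byte_value]).decode("cp1252"))
                except UnicodeDecodeError:
                    pass  # a few byte values are undefined in cp1252
        self.ansi_unicode_chars = set(ansi_unicode_chars)

    def write(self, text):
        encoded = bytearray()
        for char in text:
            if self.force_ansi and char in self.ansi_unicode_chars:
                # Selected characters go back to their single ANSI byte
                encoded.extend(char.encode("cp1252"))
            else:
                # Everything else is written as UTF-8
                encoded.extend(char.encode("utf8"))
        self.stream.write(bytes(encoded))


buffer = io.BytesIO()
writer = AnsiAwareWriterSketch(buffer, force_ansi=True)
writer.write("caf\u00e9")  # the last character is written as one ANSI byte
```

Restricting ansi_unicode_chars to a short list leaves every other character in UTF-8, which is the behavior the parameter description suggests.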

khiops.core.internals.io.encode_file_path(file_path)

Encodes a file path

This is a custom, platform-dependent path encoding for Khiops scenarios. The encoding is done only if file_path is of type str.

Parameters:
file_path : str or bytes

The path of a file.

Returns:
bytes
If file_path is str:
  • In Windows: the path encoded to UTF-8, except for the “ANSI” Unicode characters.
  • In Linux/Unix/Mac: the path encoded to UTF-8.

If file_path is bytes, the input file_path is returned unchanged.

Raises:
TypeError

If file_path is not str or bytes
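The documented contract of encode_file_path (bytes pass through, str is encoded per platform, anything else raises TypeError) can be sketched as follows. This is an illustration under assumptions, not the library's code: the function name encode_file_path_sketch and the is_windows parameter are hypothetical, and the "ANSI" characters are assumed to be those representable in Windows-1252 (cp1252).

```python
import platform


def encode_file_path_sketch(file_path, is_windows=None):
    """Hypothetical stand-in following the documented contract."""
    if isinstance(file_path, bytes):
        return file_path  # bytes are returned unchanged
    if not isinstance(file_path, str):
        raise TypeError("file_path must be str or bytes")

    # is_windows is a hypothetical switch that makes the sketch testable anywhere
    if is_windows is None:
        is_windows = platform.system() == "Windows"
    if not is_windows:
        return file_path.encode("utf8")

    # Assumption: on Windows, characters with a cp1252 equivalent keep their
    # single ANSI byte and all other characters are encoded to UTF-8
    encoded = bytearray()
    for char in file_path:
        try:
            encoded.extend(char.encode("cp1252"))
        except UnicodeEncodeError:
            encoded.extend(char.encode("utf8"))
    return bytes(encoded)
```

ASCII characters produce the same byte in both encodings, so only characters above the ASCII range are affected by the platform branch.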

khiops.core.internals.io.flexible_json_load(json_file_path)

Flexibly loads a JSON file

It first tries a plain read. If that fails, it emits a warning and loads the file again, replacing the characters that could not be decoded.

Parameters:
json_file_path : str

Path of the Khiops JSON file.

Returns:
dict

The in-memory representation of the JSON file.
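The two-step strategy described for flexible_json_load can be sketched with the standard library alone. This is a minimal sketch, not the library's implementation: the function name flexible_json_load_sketch is hypothetical, and it assumes that the failure in question is a UTF-8 decoding error and that the fallback uses Python's errors="replace" handler.

```python
import json
import warnings


def flexible_json_load_sketch(json_file_path):
    """Hypothetical stand-in for the strict-then-lenient loading strategy."""
    try:
        # Strict attempt: an invalid UTF-8 byte raises UnicodeDecodeError
        with open(json_file_path, encoding="utf8") as json_file:
            return json.load(json_file)
    except UnicodeDecodeError:
        warnings.warn(
            f"File '{json_file_path}' is not valid UTF-8; "
            "undecodable bytes are replaced"
        )
        # Lenient attempt: each undecodable byte becomes U+FFFD
        with open(json_file_path, encoding="utf8", errors="replace") as json_file:
            return json.load(json_file)
```

With this approach a file containing stray ANSI bytes still loads, at the cost of losing the affected characters, so the warning matters to anyone who relies on the exact string values.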