arche.tools.schema

Module Contents

arche.tools.schema.basic_json_schema(data_source: str, items_numbers: List[int] = None) → Schema

Print a JSON schema based on the provided data source and item numbers.

Parameters
  • data_source – a collection or job key

  • items_numbers – list of item numbers to create the schema from

arche.tools.schema.create_json_schema(source_key: str, items_numbers: Optional[List[int]] = None) → RawSchema

Create a schema from items sampled from source_key.

arche.tools.schema.infer_schema(samples: List[Dict[str, Any]]) → RawSchema

arche.tools.schema.extend_schema(schema: SchemaObject) → None

Update the schema in place with additional keywords.

arche.tools.schema.set_item_no(items_count: int) → List[int]

Generate random numbers within items_count range

Returns

4 random item numbers if items_count > 4, otherwise all item numbers
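
The return rule above can be sketched with the standard library alone. This is an illustrative stand-in, not arche's implementation: the name `sample_item_numbers` and the `sample_size` parameter are hypothetical, and the sketch assumes item numbers are zero-based.

```python
import random
from typing import List


def sample_item_numbers(items_count: int, sample_size: int = 4) -> List[int]:
    """Hypothetical stand-in for the rule documented above."""
    if items_count <= sample_size:
        # Assumption: with few items, return every zero-based item number.
        return list(range(items_count))
    # Otherwise pick distinct random item numbers from the range.
    return random.sample(range(items_count), sample_size)
```

With these assumptions, `sample_item_numbers(3)` yields `[0, 1, 2]`, while `sample_item_numbers(10)` yields four distinct numbers between 0 and 9.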

arche.tools.schema.fast_validate(schema: RawSchema, raw_items: RawItems, keys: pd.Index) → Dict[str, set]

Verify items one by one. In most cases, validation of an item stops after its first error. Faster than jsonschema validation.

Parameters
  • schema – a JSON schema

  • raw_items – raw items to validate one by one

  • keys – keys corresponding to raw_items index

Returns

A dictionary mapping error messages to item keys

arche.tools.schema.full_validate(schema: RawSchema, raw_items: RawItems, keys: pd.Index) → Dict[str, set]

This function uses the jsonschema validator, which returns all errors found per item. See fast_validate() for argument descriptions.
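
The difference between stopping at the first error and collecting every error, along with the `Dict[str, set]` return shape, can be illustrated with a small standard-library sketch. Everything in it is hypothetical: `check_item` is a toy checker that understands only `required` and a few `type` values, `collect_errors` and its `stop_on_first` flag are invented for illustration, plain lists stand in for `RawItems` and `pd.Index`, and the message wording is made up. The actual validators and their messages may differ.

```python
from collections import defaultdict
from typing import Any, Dict, Iterator, List, Set

# Toy mapping from JSON Schema type names to Python types (illustration only).
JSON_TYPES = {"string": str, "integer": int, "object": dict, "array": list}


def check_item(schema: Dict[str, Any], item: Dict[str, Any]) -> Iterator[str]:
    """Yield one message per problem; handles only `required` and `type`."""
    for field in schema.get("required", []):
        if field not in item:
            yield f"{field} is a required property"
    for field, rules in schema.get("properties", {}).items():
        expected = JSON_TYPES.get(rules.get("type"))
        if field in item and expected and not isinstance(item[field], expected):
            yield f"{field} is not of type {rules['type']}"


def collect_errors(
    schema: Dict[str, Any],
    raw_items: List[Dict[str, Any]],
    keys: List[str],
    stop_on_first: bool,
) -> Dict[str, Set[str]]:
    """Group item keys by error message."""
    errors: Dict[str, Set[str]] = defaultdict(set)
    for key, item in zip(keys, raw_items):
        for message in check_item(schema, item):
            errors[message].add(key)
            if stop_on_first:
                break  # first-error mode: move on to the next item
    return dict(errors)


schema = {"required": ["name", "price"], "properties": {"price": {"type": "integer"}}}
items = [{"price": "10"}, {"name": "b", "price": 5}]
keys = ["job/0", "job/1"]

first_only = collect_errors(schema, items, keys, stop_on_first=True)
all_found = collect_errors(schema, items, keys, stop_on_first=False)
```

Here `first_only` holds a single message for `job/0`, while `all_found` also reports the type mismatch on the same item. The valid item `job/1` appears in neither result.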

arche.tools.schema.format_validation_message(error_msg: str, path: Deque, schema_path: Deque, validator: str) → str