arche.arche
Module Contents
-
arche.arche.logger
-
class arche.arche.Arche(source: Union[str, pd.DataFrame, RawItems], schema: Optional[SchemaSource] = None, target: Optional[Union[str, pd.DataFrame]] = None, count: Optional[int] = None, start: Union[str, int] = None, filters: Optional[api.Filters] = None, expand: bool = None)
-
property source_items(self)
-
property target_items(self)
-
property schema(self)
-
static get_items(source: Union[str, pd.DataFrame, RawItems], count: Optional[int], start: Optional[str], filters: Optional[api.Filters])
-
save_result(self, rule_result)
-
report_all(self, short: bool = False, uniques: List[Union[str, List[str]]] = None)
Report on all included rules.
Parameters
uniques – see arche.rules.duplicates.find_by
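The type of `uniques`, `List[Union[str, List[str]]]`, suggests that each entry is either one field name or a list of field names. The sketch below assumes that a list means "duplicate only if all listed fields match"; this is an assumption, so check `arche.rules.duplicates.find_by` for the actual behavior. `find_duplicates` is a hypothetical stdlib-only helper written for illustration, not part of arche.

```python
from collections import defaultdict
from typing import Dict, List, Tuple, Union

Item = Dict[str, object]
Unique = Union[str, List[str]]


def find_duplicates(
    items: List[Item], uniques: List[Unique]
) -> Dict[Tuple[str, ...], List[List[int]]]:
    """Group item indexes that share the same values for each `uniques` entry.

    Hypothetical helper for illustration only; it is not arche's implementation.
    """
    result = {}
    for unique in uniques:
        # Assumption: a plain string names one field, a list names a combination.
        fields = (unique,) if isinstance(unique, str) else tuple(unique)
        groups = defaultdict(list)
        for index, item in enumerate(items):
            key = tuple(item.get(field) for field in fields)
            groups[key].append(index)
        # Keep only the groups where two or more items collide.
        result[fields] = [idxs for idxs in groups.values() if len(idxs) > 1]
    return result


items = [
    {"name": "lamp", "url": "/a", "price": 10},
    {"name": "lamp", "url": "/b", "price": 10},
    {"name": "desk", "url": "/b", "price": 10},
]
duplicates = find_duplicates(items, ["name", ["url", "price"]])
```

Here items 0 and 1 collide on `name`, while items 1 and 2 collide on the `url` and `price` combination, so the two `uniques` entries report different groups.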
-
run_all_rules(self)
-
data_quality_report(self, bucket: Optional[str] = None)
-
run_general_rules(self)
-
validate_with_json_schema(self)
Run a JSON schema check and output the results. It tries to find all errors, but there are no guarantees. Slower than glance().
-
glance(self)
Run JSON schema check and output results. In most cases it will return only the first error per item. Usable for big jobs as it’s about 100x faster than validate_with_json_schema().
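The trade-off between the two schema checks (every error per item versus only the first one) can be illustrated with a lazy error iterator. This is a minimal sketch under stated assumptions: the schema format (field name mapped to a Python type) and all function names are hypothetical stand-ins for real JSON schema validation, and none of this reflects how arche implements either method.

```python
from typing import Dict, Iterator, List

Item = Dict[str, object]
# Hypothetical schema format: field name -> required Python type.
Schema = Dict[str, type]


def iter_errors(item: Item, schema: Schema) -> Iterator[str]:
    """Lazily yield one message per field that is missing or has the wrong type."""
    for field, expected in schema.items():
        if field not in item:
            yield f"{field}: required field is missing"
        elif not isinstance(item[field], expected):
            yield f"{field}: expected {expected.__name__}"


def first_error_per_item(items: List[Item], schema: Schema) -> Dict[int, List[str]]:
    """Fast mode: stop at the first error found in each item."""
    report = {}
    for index, item in enumerate(items):
        error = next(iter_errors(item, schema), None)
        if error is not None:
            report[index] = [error]
    return report


def all_errors_per_item(items: List[Item], schema: Schema) -> Dict[int, List[str]]:
    """Thorough mode: exhaust the iterator and keep every error."""
    report = {}
    for index, item in enumerate(items):
        errors = list(iter_errors(item, schema))
        if errors:
            report[index] = errors
    return report


schema = {"name": str, "price": float}
items = [{"name": "lamp", "price": 9.5}, {"name": 7}]
quick = first_error_per_item(items, schema)
full = all_errors_per_item(items, schema)
```

In this toy case the fast mode reports only the `name` type error for the second item, while the thorough mode also reports the missing `price` field. The fast mode does less work per invalid item, which is the general reason a first-error check scales better on large jobs.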
-
run_schema_rules(self)
-
run_customized_rules(self, items, tagged_fields)
-
check_metadata(self, job)
-
compare_metadata(self, source_job, target_job)
-
run_comparison_rules(self)
-
compare_with_customized_rules(self, source_items, target_items, tagged_fields)
-
property