Schema

toolbox_pyspark.schema

Summary

The schema module is used for checking, validating, and viewing any schema differences between two different tables, either from in-memory variables, or pointing to locations on disk.

check_schemas_match

check_schemas_match(
    method: str = "by_table_and_table",
    left_table: Optional[psDataFrame] = None,
    right_table: Optional[psDataFrame] = None,
    left_table_path: Optional[str] = None,
    left_table_name: Optional[str] = None,
    right_table_path: Optional[str] = None,
    right_table_name: Optional[str] = None,
    spark_session: Optional[SparkSession] = None,
    left_table_format: str = "delta",
    right_table_format: str = "delta",
    include_change_field: bool = True,
    include_add_field: bool = True,
    include_remove_field: bool = True,
    include_change_nullable: bool = False,
    return_object: Literal["results", "check"] = "check",
) -> Union[list[tuple[str, dict[str, StructField]]], bool]

Summary

Check the schemas between two different tables.

Details

This function is heavily inspired by other packages which check and validate schema differences for pyspark tables. It streamlines that process a bit, and adds the flexibility to handle whether the table on either the left or right side is already in memory or sitting in a directory somewhere else.

Parameters:

Name Type Description Default
method str

The method to use for the comparison. That is, is either side a table in memory, or is it a table sitting on a path? Check the Notes section for all options available for this parameter.
Defaults to "by_table_and_table".

'by_table_and_table'
spark_session Optional[SparkSession]

The SparkSession to use if either the left or right tables are sitting on a path somewhere.
Defaults to None.

None
left_table Optional[DataFrame]

If method defines the left table as a table, then this parameter is the actual dataframe to do the checking against.
Defaults to None.

None
left_table_path Optional[str]

If method defines the left table as a path, then this parameter is the actual path location where the table can be found.
Defaults to None.

None
left_table_name Optional[str]

If method defines the left table as a path, then this parameter is the name of the table found at the given left_table_path location.
Defaults to None.

None
left_table_format str

If method defines the left table as a path, then this parameter is the format of the table found at the given left_table_path location.
Defaults to "delta".

'delta'
right_table Optional[DataFrame]

If method defines the right table as a table, then this parameter is the actual dataframe to do the checking against.
Defaults to None.

None
right_table_path Optional[str]

If method defines the right table as a path, then this parameter is the actual path location where the table can be found.
Defaults to None.

None
right_table_name Optional[str]

If method defines the right table as a path, then this parameter is the name of the table found at the given right_table_path location.
Defaults to None.

None
right_table_format str

If method defines the right table as a path, then this parameter is the format of the table found at the given right_table_path location.
Defaults to "delta".

'delta'
include_change_field bool

When doing the schema validations, do you want to include any fields where the data-type on the right-hand side is different from the left-hand side?
This can be read as: "What fields have had their data type changed between the left-hand side and the right-hand side?"
Defaults to True.

True
include_add_field bool

When doing the schema validations, do you want to include any fields that have been added to the left-hand side, when compared to the right-hand side?
This can be read as: "What fields have been added to the left-hand side?"
Defaults to True.

True
include_remove_field bool

When doing the schema validations, do you want to include any fields which are missing from the left-hand side and exist only on the right-hand side?
This can be read as: "What fields have been removed from the left-hand side?"
Defaults to True.

True
include_change_nullable bool

When doing the schema validations, do you want to include any fields which have had their nullability metadata changed on the right-hand side, when compared to the left-hand side?
This can be read as: "What fields had their nullability changed between the left-hand side and the right-hand side?"
Defaults to False.

False
return_object Literal['results', 'check']

After having checked the schema, how do you want the results to be returned? If "check", then only a bool value will be returned: True if the schemas match, False if there are any differences. If "results", then the actual schema differences will be returned. Check the Notes section for more information on the structure of this object.
Defaults to "check".

'check'

Raises:

Type Description
TypeError

If any of the inputs passed to the parameters of this function are not the correct type. Uses the @typeguard.typechecked decorator.

AttributeError

If the value passed to method is not a valid option.

Returns:

Type Description
Union[list[tuple[str, dict[str, StructField]]], bool]

If return_object is "results", then this will be a list of tuples of dicts containing the details of the schema differences. If return_object is "check", then it will only be a bool object indicating whether the schemas match or not.
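For instance, a caller could post-process the "results" object into human-readable messages. The sketch below is illustrative only: plain (name, dtype, nullable) tuples stand in for the pyspark StructField objects the real function returns, and describe_differences is a hypothetical helper, not part of this library.

```python
# Hypothetical consumer of the "results" object returned by
# check_schemas_match(..., return_object="results").
# Plain (name, dtype, nullable) tuples stand in for pyspark StructField objects.


def describe_differences(results):
    messages = []
    for kind, fields in results:
        if kind == "add":
            # "add" entries carry only the "left" key
            messages.append(f"added on left: {fields['left'][0]}")
        elif kind == "remove":
            # "remove" entries carry only the "right" key
            messages.append(f"removed from left: {fields['right'][0]}")
        elif kind == "change_type":
            left, right = fields["left"], fields["right"]
            messages.append(f"type changed for {left[0]}: {left[1]} -> {right[1]}")
        elif kind == "change_nullable":
            left, right = fields["left"], fields["right"]
            messages.append(f"nullability changed for {left[0]}: {left[2]} -> {right[2]}")
    return messages


sample = [
    ("add", {"left": ("e", "string", False)}),
    ("remove", {"right": ("g", "string", False)}),
    ("change_type", {"left": ("c", "string", False), "right": ("c", "int", True)}),
]
for msg in describe_differences(sample):
    print(msg)
```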

Examples

Set up
>>> # Imports
>>> from pprint import pprint
>>> import pandas as pd
>>> from pyspark.sql import SparkSession, functions as F
>>> from toolbox_pyspark.schema import check_schemas_match
>>> from toolbox_pyspark.io import write_to_path
>>> from toolbox_pyspark.checks import table_exists
>>>
>>> # Instantiate Spark
>>> spark = SparkSession.builder.getOrCreate()
>>>
>>> # Create data
>>> df1 = spark.createDataFrame(
...     pd.DataFrame(
...         {
...             "a": [0, 1, 2, 3],
...             "b": ["a", "b", "c", "d"],
...             "c": ["1", "1", "1", "1"],
...             "d": ["2", "2", "2", "2"],
...             "e": ["3", "3", "3", "3"],
...             "f": ["4", "4", "4", "4"],
...         }
...     )
... )
>>> df2 = (
...     df1.withColumn("c", F.col("c").cast("int"))
...     .withColumn("g", F.lit("a"))
...     .withColumn("d", F.lit("null"))
...     .drop("e")
... )
>>> write_to_path(
...     table=df1,
...     name="left",
...     path="./test",
...     data_format="parquet",
...     mode="overwrite",
...     write_options={"overwriteSchema": "true"},
... )
>>> write_to_path(
...     table=df2,
...     name="right",
...     path="./test",
...     data_format="parquet",
...     mode="overwrite",
...     write_options={"overwriteSchema": "true"},
... )
>>>
>>> # Check
>>> pprint(df1.dtypes)
>>> df1.show()
>>> print(table_exists("left", "./test", "parquet", spark))
>>> pprint(df2.dtypes)
>>> df2.show()
>>> print(table_exists("right", "./test", "parquet", spark))
Terminal
[
    ("a", "bigint"),
    ("b", "string"),
    ("c", "string"),
    ("d", "string"),
    ("e", "string"),
    ("f", "string"),
]
Terminal
+---+---+---+---+---+---+
| a | b | c | d | e | f |
+---+---+---+---+---+---+
| 0 | a | 1 | 2 | 3 | 4 |
| 1 | b | 1 | 2 | 3 | 4 |
| 2 | c | 1 | 2 | 3 | 4 |
| 3 | d | 1 | 2 | 3 | 4 |
+---+---+---+---+---+---+
Terminal
True
Terminal
[
    ("a", "bigint"),
    ("b", "string"),
    ("c", "int"),
    ("d", "string"),
    ("f", "string"),
    ("g", "string"),
]
Terminal
+---+---+---+------+---+---+
| a | b | c |    d | f | g |
+---+---+---+------+---+---+
| 0 | a | 1 | null | 4 | a |
| 1 | b | 1 | null | 4 | a |
| 2 | c | 1 | null | 4 | a |
| 3 | d | 1 | null | 4 | a |
+---+---+---+------+---+---+
Terminal
True

Example 1: Check matching
>>> diff = check_schemas_match(
...     method="table_table",
...     left_table=df1,
...     right_table=df1,
...     include_add_field=True,
...     include_change_field=True,
...     include_remove_field=True,
...     include_change_nullable=True,
...     return_object="check",
... )
>>> print(diff)
Terminal
True

Conclusion: Schemas match.

Example 2: Check not matching
>>> diff = check_schemas_match(
...     method="table_table",
...     left_table=df1,
...     right_table=df2,
...     include_add_field=True,
...     include_change_field=True,
...     include_remove_field=True,
...     include_change_nullable=True,
...     return_object="check",
... )
>>> print(diff)
Terminal
False

Conclusion: Schemas do not match.

Example 3: Show only `add`
>>> diff = check_schemas_match(
...     method="table_table",
...     left_table=df1,
...     right_table=df2,
...     include_add_field=True,
...     include_change_field=False,
...     include_remove_field=False,
...     include_change_nullable=False,
...     return_object="results",
... )
>>> print(diff)
Terminal
[
    (
        "add",
        {"left": T.StructField("e", T.StringType(), False)},
    ),
]

Conclusion: Schemas do not match because the e field was added.

Example 4: Show `add` and `remove`
>>> diff = check_schemas_match(
...     method="table_table",
...     left_table=df1,
...     right_table=df2,
...     include_add_field=True,
...     include_change_field=False,
...     include_remove_field=True,
...     include_change_nullable=False,
...     return_object="results",
... )
>>> print(diff)
Terminal
[
    (
        "add",
        {"left": T.StructField("e", T.StringType(), False)},
    ),
    (
        "remove",
        {"right": T.StructField("g", T.StringType(), False)},
    ),
]

Conclusion: Schemas do not match because the e field was added and the g field was removed.

Example 5: Show all changes
>>> diff = check_schemas_match(
...     method="table_table",
...     left_table=df1,
...     right_table=df2,
...     include_add_field=True,
...     include_change_field=True,
...     include_remove_field=True,
...     include_change_nullable=True,
...     return_object="results",
... )
>>> print(diff)
Terminal
[
    (
        "add",
        {"left": T.StructField("e", T.StringType(), False)},
    ),
    (
        "remove",
        {"right": T.StructField("g", T.StringType(), False)},
    ),
    (
        "change_type",
        {
            "left": T.StructField("c", T.StringType(), False),
            "right": T.StructField("c", T.IntegerType(), True),
        },
    ),
    (
        "change_nullable",
        {
            "left": T.StructField("c", T.StringType(), False),
            "right": T.StructField("c", T.IntegerType(), True),
        },
    ),
]

Conclusion: Schemas do not match because the e field was added, the g field was removed, the c field had its data type changed, and the c field had its nullability changed.

Example 6: Check where right-hand side is a `path`
>>> diff = check_schemas_match(
...     method="path_table",
...     spark_session=spark,
...     left_table=df1,
...     right_table_path="./test",
...     right_table_name="right",
...     right_table_format="parquet",
...     include_add_field=True,
...     include_change_field=False,
...     include_remove_field=False,
...     include_change_nullable=False,
...     return_object="results",
... )
>>> print(diff)
Terminal
[
    (
        "add",
        {"left": T.StructField("e", T.StringType(), False)},
    ),
]

Conclusion: Schemas do not match because the e field was added.

Example 7: Check where both sides are a `path`
>>> diff = check_schemas_match(
...     method="path_path",
...     spark_session=spark,
...     left_table_path="./test",
...     left_table_name="left",
...     left_table_format="parquet",
...     right_table_path="./test",
...     right_table_name="right",
...     right_table_format="parquet",
...     include_add_field=False,
...     include_change_field=False,
...     include_remove_field=True,
...     include_change_nullable=False,
...     return_object="results",
... )
>>> print(diff)
Terminal
[
    (
        "remove",
        {"right": T.StructField("g", T.StringType(), True)},
    ),
]

Conclusion: Schemas do not match because the g field was removed.

Example 8: Invalid `method` parameter
>>> diff = check_schemas_match(
...     method="invalid",
...     left_table=df1,
...     right_table=df2,
...     include_add_field=True,
...     include_change_field=True,
...     include_remove_field=True,
...     include_change_nullable=True,
...     return_object="check",
... )
AttributeError: Invalid value for `method`: 'invalid'
Please use one of the following options:
- For `by_table_and_table`, use one of the following values: ['table', 'table_table', 'tables', 'by_table', 'by_table_and_table', 'table_and_table']
- For `by_table_and_path`, use one of the following values: ['table_and_path', 'table_path', 'by_table_and_path']
- For `by_path_and_table`, use one of the following values: ['path_and_table', 'path_table', 'by_path_and_table']
- For `by_path_and_path`, use one of the following values: ['path_and_path', 'path_path', 'by_path_and_path', 'path', 'paths']

Conclusion: Invalid method parameter.
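The alias handling shown in the error message above can be sketched as a simple lookup. This is a hypothetical illustration of the documented groupings, not the library's actual implementation; the METHOD_ALIASES name and normalise_method helper are invented for this sketch.

```python
# Hypothetical normalisation of the `method` aliases documented in the Notes
# section, mapping each accepted alias back to its canonical form.

METHOD_ALIASES = {
    "by_table_and_table": [
        "table", "table_table", "tables", "by_table",
        "by_table_and_table", "table_and_table",
    ],
    "by_table_and_path": ["table_and_path", "table_path", "by_table_and_path"],
    "by_path_and_table": ["path_and_table", "path_table", "by_path_and_table"],
    "by_path_and_path": [
        "path_and_path", "path_path", "by_path_and_path", "path", "paths",
    ],
}


def normalise_method(method: str) -> str:
    for canonical, aliases in METHOD_ALIASES.items():
        if method in aliases:
            return canonical
    # Mirrors the documented behaviour of raising AttributeError for bad input
    raise AttributeError(f"Invalid value for `method`: '{method}'")


print(normalise_method("path_path"))  # by_path_and_path
```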

Notes
Options available in the method parameter

The options available in the method parameter include:

  • If the objects on both the left-hand side and the right-hand side are both dataframes already loaded to memory, use one of the following values:
    • "table"
    • "table_table"
    • "tables"
    • "by_table"
    • "by_table_and_table"
    • "table_and_table"
  • If the object on the left-hand side is a dataframe already loaded to memory, but the object on the right-hand side is a table sitting on a path somewhere, use one of the following values:
    • "table_and_path"
    • "table_path"
    • "by_table_and_path"
  • If the object on the left-hand side is a table sitting on a path somewhere, but the object on the right-hand side is a dataframe already loaded to memory, use one of the following values:
    • "path_and_table"
    • "path_table"
    • "by_path_and_table"
  • If the objects on both the left-hand side and the right-hand side are both tables sitting on a path somewhere, then use one of the following values:
    • "path_and_path"
    • "path_path"
    • "by_path_and_path"
    • "path"
    • "paths"
Details about the return object when we set the parameter return_object="results"

  • When we set the parameter return_object="results", then we will get an object returned from this function.
  • That object will be a list of tuples; each tuple is only two elements long, where the first element is a str object, and the second is a dict where the keys are str and the values are StructField objects.
  • For each of the tuple elements, the first element (the str object) describes what the tuple is there for. It will be one of four words: "add", "remove", "change_type", or "change_nullable".
  • You can change whether these options are included in the schema check by changing the other parameters: include_change_field, include_add_field, include_remove_field, include_change_nullable.
  • The structure of the list will look like this:
    [
        (
            "add",  # (1)!
            {"left": T.StructField("e", T.StringType(), False)},  # (2)!
        ),
        (
            "add",  # (3)!
            {"left": T.StructField("h", T.StringType(), False)},
        ),
        (
            "remove",  # (4)!
            {"right": T.StructField("g", T.StringType(), False)},  # (5)!
        ),
        (
            "change_type",  # (6)!
            {
                "left": T.StructField("c", T.StringType(), False),  # (7)!
                "right": T.StructField("c", T.IntegerType(), True),
            },
        ),
        (
            "change_nullable",  # (8)!
            {
                "left": T.StructField("c", T.StringType(), False),  # (9)!
                "right": T.StructField("c", T.IntegerType(), True),
            },
        ),
    ]
    
    1. When include_add_field=True, then the add section will always appear first.
      If include_add_field=False, then this section is omitted.
    2. The second element of the tuple is a dict that has only one key-value pair.
      The key will always be the value "left", because these are fields which have been added to the table on the left-hand side and not found on the right-hand side.
    3. When there are multiple fields added to the table on the left-hand side, they will appear like this.
    4. When include_remove_field=True, then the remove section will always appear next.
      If include_remove_field=False, then this section is omitted.
    5. The second element of the tuple is a dict that has only one key-value pair.
      The key will always be the value "right", because these are fields which have been removed from the left-hand side and only visible on the right-hand side.
    6. When include_change_field=True, then the change_type section will always appear next.
      If include_change_field=False, then this section is omitted.
    7. The second element of the tuple is a dict that has two key-value pairs.
      The keys will always be the values "left" then "right", because these are fields where the data type has changed between the left-hand side and the right-hand side, and therefore you need to see both to see exactly what has changed.
    8. When include_change_nullable=True, then the change_nullable section will always appear next.
      If include_change_nullable=False, then this section is omitted.
    9. The second element of the tuple is a dict that has two key-value pairs.
      The keys will always be the values "left" then "right", because these are fields where the nullability has changed between the left-hand side and the right-hand side, and therefore you need to see both to see exactly what has changed.
    Source code in src/toolbox_pyspark/schema.py
    317
    318
    319
    320
    321
    322
    323
    324
    325
    326
    327
    328
    329
    330
    331
    332
    333
    334
    335
    336
    337
    338
    339
    340
    341
    342
    343
    344
    345
    346
    347
    348
    349
    350
    351
    352
    353
    354
    355
    356
    357
    358
    359
    360
    361
    362
    363
    364
    365
    366
    367
    368
    369
    370
    371
    372
    373
    374
    375
    376
    377
    378
    379
    380
    381
    382
    383
    384
    385
    386
    387
    388
    389
    390
    391
    392
    393
    394
    395
    396
    397
    398
    399
    400
    401
    402
    403
    404
    405
    406
    407
    408
    409
    410
    411
    412
    413
    414
    415
    416
    417
    418
    419
    420
    421
    422
    423
    424
    425
    426
    427
    428
    429
    430
    431
    432
    433
    434
    435
    436
    437
    438
    439
    440
    441
    442
    443
    444
    445
    446
    447
    448
    449
    450
    451
    452
    453
    454
    455
    456
    457
    458
    459
    460
    461
    462
    463
    464
    465
    466
    467
    468
    469
    470
    471
    472
    473
    474
    475
    476
    477
    478
    479
    480
    481
    482
    483
    484
    485
    486
    487
    488
    489
    490
    491
    492
    493
    494
    495
    496
    497
    498
    499
    500
    501
    502
    503
    504
    505
    506
    507
    508
    509
    510
    511
    512
    513
    514
    515
    516
    517
    518
    519
    520
    521
    522
    523
    524
    525
    526
    527
    528
    529
    530
    531
    532
    533
    534
    535
    536
    537
    538
    539
    540
    541
    542
    543
    544
    545
    546
    547
    548
    549
    550
    551
    552
    553
    554
    555
    556
    557
    558
    559
    560
    561
    562
    563
    564
    565
    566
    567
    568
    569
    570
    571
    572
    573
    574
    575
    576
    577
    578
    579
    580
    581
    582
    583
    584
    585
    586
    587
    588
    589
    590
    591
    592
    593
    594
    595
    596
    597
    598
    599
    600
    601
    602
    603
    604
    605
    606
    607
    608
    609
    610
    611
    612
    613
    614
    615
    616
    617
    618
    619
    620
    621
    622
    623
    624
    625
    626
    627
    628
    629
    630
    631
    632
    633
    634
    635
    636
    637
    638
    639
    640
    641
    642
    643
    644
    645
    646
    647
    648
    649
    650
    651
    652
    653
    654
    655
    656
    657
    658
    659
    660
    661
    662
    663
    664
    665
    666
    667
    668
    669
    670
    671
    672
    673
    674
    675
    676
    677
    678
    679
    680
    681
    682
    683
    684
    685
    686
    687
    688
    689
    690
    691
    692
    693
    694
    695
    696
    697
    698
    699
    700
    701
    702
    703
    704
    705
    706
    707
    708
    709
    710
    711
    712
    713
    714
    715
    716
    717
    718
    719
    720
    721
    722
    723
    724
    725
    726
    727
    728
    729
    730
    731
    732
    733
    734
    735
    736
    737
    738
    739
    740
    741
    742
    743
    744
    745
    746
    747
    748
    749
    750
    751
    752
    753
    754
    755
    756
    757
    758
    759
    760
    761
    762
    763
    764
    765
    766
    767
    768
    769
    770
    771
    772
    773
    774
    775
    776
    777
    778
    779
    780
    781
    782
    783
    784
    785
    786
    787
    788
    789
    790
    791
    792
    793
    794
    795
    796
    797
    798
    799
    800
    801
    802
    803
    804
    805
    806
    807
    808
    809
    810
    811
    812
    813
    814
    815
    816
    817
    818
    819
    820
    821
    822
    823
    824
    825
    826
    827
    828
    829
    830
    831
    832
    833
    834
    835
    836
    837
    838
    839
    840
    841
    842
    843
    844
    845
    846
    847
    848
    849
    850
    851
    852
    853
    854
    855
    856
    857
    858
    859
    860
    861
    862
    863
    864
    865
    866
    867
    868
    869
    870
    871
    872
    873
    874
    875
    876
    877
    878
    879
    880
    881
    882
    883
    884
    885
    886
    887
    888
    889
    890
    891
    892
    893
    894
    895
    896
    897
    898
    899
    900
    @typechecked
    def check_schemas_match(
        method: str = "by_table_and_table",
        left_table: Optional[psDataFrame] = None,
        right_table: Optional[psDataFrame] = None,
        left_table_path: Optional[str] = None,
        left_table_name: Optional[str] = None,
        right_table_path: Optional[str] = None,
        right_table_name: Optional[str] = None,
        spark_session: Optional[SparkSession] = None,
        left_table_format: str = "delta",
        right_table_format: str = "delta",
        include_change_field: bool = True,
        include_add_field: bool = True,
        include_remove_field: bool = True,
        include_change_nullable: bool = False,
        return_object: Literal["results", "check"] = "check",
    ) -> Union[list[tuple[str, dict[str, StructField]]], bool]:
        """
        !!! note "Summary"
            Check the schemas between two different tables.
    
        ???+ abstract "Details"
            This function is heavily inspired by other packages which check and validate schema differences for `pyspark` tables. This function just streamlines it a bit, and adds additional functionality for whether or not table on either `left` or `right` side is already in-memory or sitting on a directory somewhere else.
    
        Params:
            method (str, optional):
                The method to use for the comparison. That is, is either side a table in memory or is it a `table` sitting on a `path`?. Check the Notes section for all options available for this parameter.<br>
                Defaults to `#!py "by_table_and_table"`.
            spark_session (Optional[SparkSession], optional):
                The `SparkSession` to use if either the `left` or `right` tables are sitting on a `path` somewhere.<br>
                Defaults to `#!py None`.
            left_table (Optional[psDataFrame], optional):
                If `method` defines the `left` table as a `table`, then this parameter is the actual `dataframe` to do the checking against.<br>
                Defaults to `#!py None`.
            left_table_path (Optional[str], optional):
                If `method` defines the `left` table as a `path`, then this parameter is the actual path location where the table can be found.<br>
                Defaults to `#!py None`.
            left_table_name (Optional[str], optional):
                If `method` defines the `left` table as a `path`, then this parameter is the name of the table found at the given `left_table_path` location.<br>
                Defaults to `#!py None`.
            left_table_format (str, optional):
                If `method` defines the `left` table as a `path`, then this parameter is the format of the table found at the given `left_table_path` location.<br>
                Defaults to `#!py "delta"`.
            right_table (Optional[psDataFrame], optional):
                If `method` defines the `right` table as a `table`, then this parameter is the actual `dataframe` to do the checking against.<br>
                Defaults to `#!py None`.
            right_table_path (Optional[str], optional):
                If `method` defines the `right` table as a `path`, then this parameter is the actual path location where the table can be found.<br>
                Defaults to `#!py None`.
            right_table_name (Optional[str], optional):
                If `method` defines the `right` table as a `path`, then this parameter is the name of the table found at the given `right_table_path` location.<br>
                Defaults to `#!py None`.
            right_table_format (str, optional):
                If `method` defines the `right` table as a `path`, then this parameter is the format of the table found at the given `right_table_path` location.<br>
                Defaults to `#!py "delta"`.
            include_change_field (bool, optional):
                When doing the schema validations, do you want to include any fields where the data-type on the right-hand side is different from the left-hand side?<br>
                This can be read as: "What fields have had their data type _changed **between**_ the left-hand side and the right-hand side?"<br>
                Defaults to `#!py True`.
            include_add_field (bool, optional):
                When doing the schema validations, do you want to include any fields that have had any additional fields added to the left-hand side, when compared to the right-hand side?<br>
                This can be read as: "What fields have been _added **to**_ the left-hand side?"<br>
                Defaults to `#!py True`.
            include_remove_field (bool, optional):
                When doing the schema validations, do you want to include any fields which are missing from the left-hand side and only existing on the right-hand side?<br>
                This can be read as: "What fields been _removed **from**_ the left-hand side?"<br>
                Defaults to `#!py True`.
            include_change_nullable (bool, optional):
                When doing the schema validations, do you want to include any fields which have had their nullability metadata changed on the right-hand side, when compared to the left-hand side?.<br>
                This can be read as: "What fields had their nullability _changed **between**_ the left-hand side and the right-hand side?"<br>
                Defaults to `#!py False`.
            return_object (Literal["results", "check"], optional):
                After having checked the schema, how do you want the results to be returned? If `#!py "check"`, then will only return a `#!py bool` value: `#!py True` if the schemas actually match, `#!py False` if there are any differences. If `#!py "results"`, then the actual schema differences will be returned. Check the Notes section for more information on the structure of this object.<br>
                Defaults to `#!py "check"`.
    
        Raises:
            TypeError:
                If any of the inputs parsed to the parameters of this function are not the correct type. Uses the [`@typeguard.typechecked`](https://typeguard.readthedocs.io/en/stable/api.html#typeguard.typechecked) decorator.
            AttributeError:
                If the value parse'd to `method` is not a valid option.
    
        Returns:
            (Union[list[tuple[str, dict[str, StructField]]], bool]):
                If `return_object` is `#!py "results"`, then this will be a `#!py list` of `#!py tuple`'s of `#!py dict`'s containing the details of the schema differences. If `return_object` is `#!py "check"`, then it will only be a `#!py bool` object about whether the schemas match or not.
    
        ???+ example "Examples"
    
            ```{.py .python linenums="1" title="Set up"}
            >>> # Imports
            >>> from pprint import pprint
            >>> import pandas as pd
            >>> from pyspark.sql import SparkSession, functions as F
            >>> from toolbox_pyspark.schema import check_schemas_match
            >>> from toolbox_pyspark.io import write_to_path
            >>> from toolbox_pyspark.checks import table_exists
            >>>
            >>> # Instantiate Spark
            >>> spark = SparkSession.builder.getOrCreate()
            >>>
            >>> # Create data
            >>> df1 = spark.createDataFrame(
            ...     pd.DataFrame(
            ...         {
            ...             "a": [0, 1, 2, 3],
            ...             "b": ["a", "b", "c", "d"],
            ...             "c": ["1", "1", "1", "1"],
            ...             "d": ["2", "2", "2", "2"],
            ...             "e": ["3", "3", "3", "3"],
            ...             "f": ["4", "4", "4", "4"],
            ...         }
            ...     )
            ... )
            >>> df2 = (
            ...     df1.withColumn("c", F.col("c").cast("int"))
            ...     .withColumn("g", F.lit("a"))
            ...     .withColumn("d", F.lit("null"))
            ...     .drop("e")
            ... )
            >>> write_to_path(
            ...     table=df1,
            ...     name="left",
            ...     path="./test",
            ...     data_format="parquet",
            ...     mode="overwrite",
            ...     write_options={"overwriteSchema": "true"},
            ... )
            >>> write_to_path(
            ...     table=df2,
            ...     name="right",
            ...     path="./test",
            ...     data_format="parquet",
            ...     mode="overwrite",
            ...     write_options={"overwriteSchema": "true"},
            ... )
            >>>
            >>> # Check
            >>> pprint(df1.dtypes)
            >>> print(df1.show())
            >>> print(table_exists("left", "./test", "parquet", spark))
            >>> pprint(df2.dtypes)
            >>> print(df2.show())
            >>> print(table_exists("right", "./test", "parquet", spark))
            ```
            <div class="result" markdown>
            ```{.sh .shell title="Terminal"}
            [
                ("a", "bigint"),
                ("b", "string"),
                ("c", "string"),
                ("d", "string"),
                ("e", "string"),
                ("f", "string"),
            ]
            ```
            ```{.txt .text title="Terminal"}
            +---+---+---+---+---+---+
            | a | b | c | d | e | f |
            +---+---+---+---+---+---+
            | 0 | a | 1 | 2 | 3 | 4 |
            | 1 | b | 1 | 2 | 3 | 4 |
            | 2 | c | 1 | 2 | 3 | 4 |
            | 3 | d | 1 | 2 | 3 | 4 |
            +---+---+---+---+---+---+
            ```
            ```{.sh .shell title="Terminal"}
            True
            ```
            ```{.sh .shell title="Terminal"}
            [
                ("a", "bigint"),
                ("b", "string"),
                ("c", "int"),
                ("d", "string"),
                ("f", "string"),
                ("g", "string"),
            ]
            ```
            ```{.txt .text title="Terminal"}
            +---+---+---+------+---+---+
            | a | b | c |    d | f | g |
            +---+---+---+------+---+---+
            | 0 | a | 1 | null | 4 | a |
            | 1 | b | 1 | null | 4 | a |
            | 2 | c | 1 | null | 4 | a |
            | 3 | d | 1 | null | 4 | a |
            +---+---+---+------+---+---+
            ```
            ```{.sh .shell title="Terminal"}
            True
            ```
            </div>
    
            ```{.py .python linenums="1" title="Example 1: Check matching"}
            >>> diff = check_schemas_match(
            ...     method="table_table",
            ...     left_table=df1,
            ...     right_table=df1,
            ...     include_add_field=True,
            ...     include_change_field=True,
            ...     include_remove_field=True,
            ...     include_change_nullable=True,
            ...     return_object="check",
            ... )
            >>> print(diff)
            ```
            <div class="result" markdown>
            ```{.sh .shell title="Terminal"}
            True
            ```
            !!! success "Conclusion: Schemas match."
            </div>
    
            ```{.py .python linenums="1" title="Example 2: Check not matching"}
            >>> diff = check_schemas_match(
            ...     method="table_table",
            ...     left_table=df1,
            ...     right_table=df2,
            ...     include_add_field=True,
            ...     include_change_field=True,
            ...     include_remove_field=True,
            ...     include_change_nullable=True,
            ...     return_object="check",
            ... )
            >>> print(diff)
            ```
            <div class="result" markdown>
            ```{.sh .shell title="Terminal"}
            False
            ```
            !!! failure "Conclusion: Schemas do not match."
            </div>
    
            ```{.py .python linenums="1" title="Example 3: Show only `add`"}
            >>> diff = check_schemas_match(
            ...     method="table_table",
            ...     left_table=df1,
            ...     right_table=df2,
            ...     include_add_field=True,
            ...     include_change_field=False,
            ...     include_remove_field=False,
            ...     include_change_nullable=False,
            ...     return_object="results",
            ... )
            >>> print(diff)
            ```
            <div class="result" markdown>
            ```{.sh .shell title="Terminal"}
            [
                (
                    "add",
                    {"left": T.StructField("e", T.StringType(), False)},
                ),
            ]
            ```
            !!! failure "Conclusion: Schemas do not match because the `e` field was added."
            </div>
    
            ```{.py .python linenums="1" title="Example 4: Show `add` and `remove`"}
            >>> diff = check_schemas_match(
            ...     method="table_table",
            ...     left_table=df1,
            ...     right_table=df2,
            ...     include_add_field=True,
            ...     include_change_field=False,
            ...     include_remove_field=True,
            ...     include_change_nullable=False,
            ...     return_object="results",
            ... )
            >>> print(diff)
            ```
            <div class="result" markdown>
            ```{.sh .shell title="Terminal"}
            [
                (
                    "add",
                    {"left": T.StructField("e", T.StringType(), False)},
                ),
                (
                    "remove",
                    {"right": T.StructField("g", T.StringType(), False)},
                ),
            ]
            ```
            !!! failure "Conclusion: Schemas do not match because the `e` field was added and the `g` field was removed."
            </div>
    
            ```{.py .python linenums="1" title="Example 5: Show all changes"}
            >>> diff = check_schemas_match(
            ...     method="table_table",
            ...     left_table=df1,
            ...     right_table=df2,
            ...     include_add_field=True,
            ...     include_change_field=True,
            ...     include_remove_field=True,
            ...     include_change_nullable=True,
            ...     return_object="results",
            ... )
            >>> print(diff)
            ```
            <div class="result" markdown>
            ```{.sh .shell title="Terminal"}
            [
                (
                    "add",
                    {"left": T.StructField("e", T.StringType(), False)},
                ),
                (
                    "remove",
                    {"right": T.StructField("g", T.StringType(), False)},
                ),
                (
                    "change_type",
                    {
                        "left": T.StructField("c", T.StringType(), False),
                        "right": T.StructField("c", T.IntegerType(), True),
                    },
                ),
                (
                    "change_nullable",
                    {
                        "left": T.StructField("c", T.StringType(), False),
                        "right": T.StructField("c", T.IntegerType(), True),
                    },
                ),
            ]
            ```
            !!! failure "Conclusion: Schemas do not match because the `e` field was added, the `g` field was removed, the `c` field had its data type changed, and the `c` field had its nullability changed."
            </div>
    
            ```{.py .python linenums="1" title="Example 6: Check where right-hand side is a `path`"}
            >>> diff = check_schemas_match(
            ...     method="table_path",
            ...     spark_session=spark,
            ...     left_table=df1,
            ...     right_table_path="./test",
            ...     right_table_name="right",
            ...     right_table_format="parquet",
            ...     include_add_field=True,
            ...     include_change_field=False,
            ...     include_remove_field=False,
            ...     include_change_nullable=False,
            ...     return_object="results",
            ... )
            >>> print(diff)
            ```
            <div class="result" markdown>
            ```{.sh .shell title="Terminal"}
            [
                (
                    "add",
                    {"left": T.StructField("e", T.StringType(), False)},
                ),
            ]
            ```
            !!! failure "Conclusion: Schemas do not match because the `e` field was added."
            </div>
    
            ```{.py .python linenums="1" title="Example 7: Check where both sides are a `path`"}
            >>> diff = check_schemas_match(
            ...     method="path_path",
            ...     spark_session=spark,
            ...     left_table_path="./test",
            ...     left_table_name="left",
            ...     left_table_format="parquet",
            ...     right_table_path="./test",
            ...     right_table_name="right",
            ...     right_table_format="parquet",
            ...     include_add_field=False,
            ...     include_change_field=True,
            ...     include_remove_field=False,
            ...     include_change_nullable=False,
            ...     return_object="results",
            ... )
            >>> print(diff)
            ```
            <div class="result" markdown>
            ```{.sh .shell title="Terminal"}
            [
                (
                    "remove",
                    {"right": T.StructField("g", T.StringType(), True)},
                ),
            ]
            ```
            !!! failure "Conclusion: Schemas do not match because the `g` field was removed."
            </div>
    
            ```{.py .python linenums="1" title="Example 8: Invalid `method` parameter"}
            >>> diff = check_schemas_match(
            ...     method="invalid",
            ...     left_table=df1,
            ...     right_table=df2,
            ...     include_add_field=True,
            ...     include_change_field=True,
            ...     include_remove_field=True,
            ...     include_change_nullable=True,
            ...     return_object="check",
            ... )
            ```
            <div class="result" markdown>
            ```{.py .python title="Terminal"}
            AttributeError: Invalid value for `method`: 'invalid'
            Please use one of the following options:
            - For `by_table_and_table`, use one of the following values: ['table', 'table_table', 'tables', 'by_table', 'by_table_and_table', 'table_and_table']
            - For `by_table_and_path`, use one of the following values: ['table_and_path', 'table_path', 'by_table_and_path']
            - For `by_path_and_table`, use one of the following values: ['path_and_table', 'path_table', 'by_path_and_table']
            - For `by_path_and_path`, use one of the following values: ['path_and_path', 'path_path', 'by_path_and_path', 'path', 'paths']
            ```
            !!! failure "Conclusion: Invalid `method` parameter."
            </div>
    
        ???+ info "Notes"
    
            ???+ info "Options available in the `method` parameter"
    
                The options available in the `method` parameter include:
    
                - If the objects on both the left-hand side and the right-hand side are both `dataframes` already loaded to memory, use one of the following values:
                    <div class="mdx-three-columns" markdown>
                    - `#!py "table"`
                    - `#!py "table_table"`
                    - `#!py "tables"`
                    - `#!py "by_table"`
                    - `#!py "by_table_and_table"`
                    - `#!py "table_and_table"`
                    </div>
                - If the object on the left-hand side is a `dataframe` already loaded to memory, but the object on the right-hand side is a table sitting on a path somewhere, use one of the following values:
                    <div class="mdx-three-columns" markdown>
                    - `#!py "table_and_path"`
                    - `#!py "table_path"`
                    - `#!py "by_table_and_path"`
                    </div>
                - If the object on the left-hand side is a table sitting on a path somewhere, but the object on the right-hand side is a `dataframe` already loaded to memory, use one of the following values:
                    <div class="mdx-three-columns" markdown>
                    - `#!py "path_and_table"`
                    - `#!py "path_table"`
                    - `#!py "by_path_and_table"`
                    </div>
                - If the objects on both the left-hand side and the right-hand side are both tables sitting on a path somewhere, then use one of the following values:
                    <div class="mdx-three-columns" markdown>
                    - `#!py "path_and_path"`
                    - `#!py "path_path"`
                    - `#!py "by_path_and_path"`
                    - `#!py "path"`
                    - `#!py "paths"`
                    </div>
    
            ???+ info "Details about the return object when we set the parameter `#!py return_object="results"`"
    
                - When we set the parameter `#!py return_object="results"`, then we will get an object returned from this function.
                - That object will be a `#!py list` of `#!py tuple`'s, each `#!py tuple` is only two-elements long, where the first element is a `#!py str` object, and the second is a `#!py dict` where the keys are `#!py str` and the values are a `#!py StructField` object.
                - For each of the `#!py tuple` elements, the first element (the `#!py str` object) describes what the `#!py tuple` is there for. It will be one of four words: `#!py "add"`, `#!py "remove"`, `#!py "change_type"`, or `#!py "change_nullable"`.
                - You can change whether these options are included in the schema check by changing the other parameters: `#!py include_change_field`, `#!py include_add_field`, `#!py include_remove_field`, `#!py include_change_nullable`.
                - The structure of the list will look like this:
    
                ```{.py .python title="The structure of the returned object"}
                [
                    (
                        "add",  # (1)!
                        {"left": T.StructField("e", T.StringType(), False)},  # (2)!
                    ),
                    (
                        "add",  # (3)!
                        {"left": T.StructField("h", T.StringType(), False)},
                    ),
                    (
                        "remove",  # (4)!
                        {"right": T.StructField("g", T.StringType(), False)},  # (5)!
                    ),
                    (
                        "change_type",  # (6)!
                        {
                            "left": T.StructField("c", T.StringType(), False),  # (7)!
                            "right": T.StructField("c", T.IntegerType(), True),
                        },
                    ),
                    (
                        "change_nullable",  # (8)!
                        {
                            "left": T.StructField("c", T.StringType(), False),  # (9)!
                            "right": T.StructField("c", T.IntegerType(), True),
                        },
                    ),
                ]
                ```
    
                1. When `#!py include_add_field=True`, then the `add` section will always appear first.<br>
                    If `#!py include_add_field=False`, then this section is omitted.
                2. The second element of the `#!py tuple` is a `#!py dict` that has only one `key`-`value` pair.<br>
                    The `key` will _always_ be the value `#!py "left"`, because these are fields which have been added to the table on the left-hand side and not found on the right-hand side.
                3. When there are multiple fields added to the table on the left-hand side, they will appear like this.
                4. When `#!py include_remove_field=True`, then the `remove` section will always appear next.<br>
                    If `#!py include_remove_field=False`, then this section is omitted.
                5. The second element of the `#!py tuple` is a `#!py dict` that has only one `key`-`value` pair.<br>
                    The `key` will _always_ be the value `#!py "right"`, because these are fields which have been removed from the left-hand side and only visible on the right-hand side.
                6. When `#!py include_change_field=True`, then the `change_type` section will always appear next.<br>
                    If `#!py include_change_field=False`, then this section is omitted.
                7. The second element of the `#!py tuple` is a `#!py dict` that has two `key`-`value` pairs.<br>
                    The `key`'s will _always_ be the values `#!py "left"` then `#!py "right"`, because these are fields where the data type has changed between the left-hand side and the right-hand side, and therefore you need to see both to see exactly what has changed.
                8. When `#!py include_change_nullable=True`, then the `change_nullable` section will always appear next.<br>
                    If `#!py include_change_nullable=False`, then this section is omitted.
                9. The second element of the `#!py tuple` is a `#!py dict` that has two `key`-`value` pairs.<br>
                    The `key`'s will _always_ be the values `#!py "left"` then `#!py "right"`, because these are fields where the nullability has changed between the left-hand side and the right-hand side, and therefore you need to see both to see exactly what has changed.
        """
    
        valid_methods = ValidMethods()
        msg: str = "If using the '{meth}' method, then '{name}' cannot be 'None'."
    
        if method in valid_methods.by_table_and_table:
            assert left_table is not None, msg.format(meth=method, name="left_table")
            assert right_table is not None, msg.format(meth=method, name="right_table")
            return _check_schemas_match_by_table_and_table(
                left_table=left_table,
                right_table=right_table,
                include_change_field=include_change_field,
                include_add_field=include_add_field,
                include_remove_field=include_remove_field,
                include_change_nullable=include_change_nullable,
                return_object=return_object,
            )
        elif method in valid_methods.by_table_and_path:
            assert left_table is not None, msg.format(meth=method, name="left_table")
            assert right_table_path is not None, msg.format(meth=method, name="right_table_path")
            assert right_table_name is not None, msg.format(meth=method, name="right_table_name")
            assert spark_session is not None, msg.format(meth=method, name="spark_session")
            return _check_schemas_match_by_table_and_path(
                left_table=left_table,
                right_table_path=right_table_path,
                right_table_name=right_table_name,
                right_table_format=right_table_format,
                spark_session=spark_session,
                include_change_field=include_change_field,
                include_add_field=include_add_field,
                include_remove_field=include_remove_field,
                include_change_nullable=include_change_nullable,
                return_object=return_object,
            )
        elif method in valid_methods.by_path_and_table:
            assert left_table_path is not None, msg.format(meth=method, name="left_table_path")
            assert left_table_name is not None, msg.format(meth=method, name="left_table_name")
            assert right_table is not None, msg.format(meth=method, name="right_table")
            assert spark_session is not None, msg.format(meth=method, name="spark_session")
            return _check_schemas_match_by_path_and_table(
                left_table_path=left_table_path,
                left_table_name=left_table_name,
                right_table=right_table,
                spark_session=spark_session,
                left_table_format=left_table_format,
                include_change_field=include_change_field,
                include_add_field=include_add_field,
                include_remove_field=include_remove_field,
                include_change_nullable=include_change_nullable,
                return_object=return_object,
            )
        elif method in valid_methods.by_path_and_path:
            assert left_table_path is not None, msg.format(meth=method, name="left_table_path")
            assert left_table_name is not None, msg.format(meth=method, name="left_table_name")
            assert right_table_path is not None, msg.format(meth=method, name="right_table_path")
            assert right_table_name is not None, msg.format(meth=method, name="right_table_name")
            assert spark_session is not None, msg.format(meth=method, name="spark_session")
            return _check_schemas_match_by_path_and_path(
                left_table_path=left_table_path,
                left_table_name=left_table_name,
                left_table_format=left_table_format,
                right_table_path=right_table_path,
                right_table_name=right_table_name,
                right_table_format=right_table_format,
                spark_session=spark_session,
                include_change_field=include_change_field,
                include_add_field=include_add_field,
                include_remove_field=include_remove_field,
                include_change_nullable=include_change_nullable,
                return_object=return_object,
            )
        else:
            raise AttributeError(
                f"Invalid value for `method`: '{method}'\n"
                f"Please use one of the following options:\n"
                f"- For `by_table_and_table`, use one of: {valid_methods.by_table_and_table}\n"
                f"- For `by_table_and_path`, use one of: {valid_methods.by_table_and_path}\n"
                f"- For `by_path_and_table`, use one of: {valid_methods.by_path_and_table}\n"
                f"- For `by_path_and_path`, use one of: {valid_methods.by_path_and_path}\n"
            )
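The dispatch above hinges on which alias group the `method` string falls into. As a minimal, hypothetical sketch (the library's actual `ValidMethods` class may be implemented differently), the alias-to-canonical mapping can be expressed as:

```python
# Hypothetical sketch of the `method` alias resolution behind the
# dispatch above; not the library's actual implementation.
METHOD_ALIASES: dict[str, frozenset[str]] = {
    "by_table_and_table": frozenset(
        {"table", "table_table", "tables", "by_table", "by_table_and_table", "table_and_table"}
    ),
    "by_table_and_path": frozenset({"table_and_path", "table_path", "by_table_and_path"}),
    "by_path_and_table": frozenset({"path_and_table", "path_table", "by_path_and_table"}),
    "by_path_and_path": frozenset(
        {"path_and_path", "path_path", "by_path_and_path", "path", "paths"}
    ),
}

def resolve_method(method: str) -> str:
    """Return the canonical comparison mode for a given `method` alias."""
    for canonical, aliases in METHOD_ALIASES.items():
        if method in aliases:
            return canonical
    raise AttributeError(f"Invalid value for `method`: '{method}'")
```

Every alias listed in the Notes section resolves to exactly one of the four canonical branches, which is why, for example, `"tables"` and `"by_table_and_table"` behave identically.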
    

    view_schema_differences 🔗

    view_schema_differences(
        method: str = "by_table_and_table",
        spark_session: Optional[SparkSession] = None,
        left_table: Optional[psDataFrame] = None,
        left_table_path: Optional[str] = None,
        left_table_name: Optional[str] = None,
        left_table_format: str = "delta",
        right_table: Optional[psDataFrame] = None,
        right_table_path: Optional[str] = None,
        right_table_name: Optional[str] = None,
        right_table_format: str = "delta",
        include_change_field: bool = True,
        include_add_field: bool = True,
        include_remove_field: bool = True,
        include_change_nullable: bool = False,
        view_type: Literal[
            "print", "pprint", "return"
        ] = "pprint",
    ) -> Optional[
        Union[list[tuple[str, dict[str, StructField]]], bool]
    ]
    

    Summary

    View the schemas between two different tables.

    Details

    The primary difference between check_schemas_match() and view_schema_differences() is that check_...() returns either a bool result or the actual details of the schema differences, whilst view_...() can also return the details object, but will additionally print the result to the terminal for you to review.
    For full details of all the parameters and all the options, including nuances and detailed explanations and thorough examples, please check the check_schemas_match() function.
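Conceptually, the relationship between the two functions can be sketched like this (stub functions for illustration only; the real implementations differ):

```python
from pprint import pprint

def _get_schema_diff():
    # Stand-in for the real comparison logic; returns a fixed example diff.
    return [("add", {"left": ("e", "string", False)})]

def check_schemas_match_sketch(return_object="check"):
    # check_...() either returns the diff details, or a simple bool.
    results = _get_schema_diff()
    return results if return_object == "results" else len(results) == 0

def view_schema_differences_sketch(view_type="pprint"):
    # view_...() fetches the same details, but displays them for review.
    results = check_schemas_match_sketch(return_object="results")
    if view_type == "return":
        return results
    pprint(results) if view_type == "pprint" else print(results)
```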

    Parameters:

    Name Type Description Default
    method str

    The method to use for the comparison. That is, is either side a table in memory, or is it a table sitting on a path? Check the Notes section for all options available for this parameter.
    Defaults to "by_table_and_table".

    'by_table_and_table'
    spark_session Optional[SparkSession]

    The SparkSession to use if either the left or right tables are sitting on a path somewhere.
    Defaults to None.

    None
    left_table Optional[DataFrame]

    If method defines the left table as a table, then this parameter is the actual dataframe to do the checking against.
    Defaults to None.

    None
    left_table_path Optional[str]

    If method defines the left table as a path, then this parameter is the actual path location where the table can be found.
    Defaults to None.

    None
    left_table_name Optional[str]

    If method defines the left table as a path, then this parameter is the name of the table found at the given left_table_path location.
    Defaults to None.

    None
    left_table_format str

    If method defines the left table as a path, then this parameter is the format of the table found at the given left_table_path location.
    Defaults to "delta".

    'delta'
    right_table Optional[DataFrame]

    If method defines the right table as a table, then this parameter is the actual dataframe to do the checking against.
    Defaults to None.

    None
    right_table_path Optional[str]

    If method defines the right table as a path, then this parameter is the actual path location where the table can be found.
    Defaults to None.

    None
    right_table_name Optional[str]

    If method defines the right table as a path, then this parameter is the name of the table found at the given right_table_path location.
    Defaults to None.

    None
    right_table_format str

    If method defines the right table as a path, then this parameter is the format of the table found at the given right_table_path location.
    Defaults to "delta".

    'delta'
    include_change_field bool

    When doing the schema validations, do you want to include any fields where the data-type on the right-hand side is different from the left-hand side?
    This can be read as: "What fields have had their data type changed between the left-hand side and the right-hand side?"
    Defaults to True.

    True
    include_add_field bool

    When doing the schema validations, do you want to include any fields that have been added to the left-hand side, when compared to the right-hand side?
    This can be read as: "What fields have been added to the left-hand side?"
    Defaults to True.

    True
    include_remove_field bool

    When doing the schema validations, do you want to include any fields which are missing from the left-hand side and only existing on the right-hand side?
    This can be read as: "What fields have been removed from the left-hand side?"
    Defaults to True.

    True
    include_change_nullable bool

    When doing the schema validations, do you want to include any fields which have had their nullability metadata changed on the right-hand side, when compared to the left-hand side?
    This can be read as: "What fields had their nullability changed between the left-hand side and the right-hand side?"
    Defaults to False.

    False
    view_type Literal['print', 'pprint', 'return']

    When returning the output from this function, how do you want it to be displayed? Must be one of ["print", "pprint", "return"].
    Logically, the difference is that "print" will display a text value to the terminal that is not formatted in any way; "pprint" will display a pretty-printed text value to the terminal; and "return" will return the schema differences which can then be assigned to another variable.
    Defaults to "pprint".

    'pprint'
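The practical difference between "print" and "pprint" is layout only. For example, using Python's built-in pprint module (string stand-ins shown here in place of real StructField objects):

```python
from pprint import pformat

# An example diff list, shaped like this function's output.
diff = [
    ("change_type", {"left": "StructField('c', StringType(), False)",
                     "right": "StructField('c', IntegerType(), True)"}),
]

print(str(diff))                # "print": one long, unformatted line
print(pformat(diff, width=60))  # "pprint": wrapped across lines for readability
```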

    Raises:

    Type Description
    TypeError

    If any of the inputs passed to the parameters of this function are not the correct type. Uses the @typeguard.typechecked decorator.

    AttributeError

    If the value passed to method is not a valid option.

    Returns:

    Type Description
    Optional[list[tuple[str, dict[str, StructField]]]]

    If view_type="return", then this will be a list of tuple's of dict's containing the details of the schema differences. If view_type!="return" (or if view_type="return", but there are actually no differences in the schema), then nothing is returned; only printed to terminal.
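The four diff categories described on this page can be reproduced with plain set logic over two schemas. Below is a minimal pure-Python sketch (illustrative only; the library operates on real pyspark StructField objects, and its traversal order may differ):

```python
from typing import NamedTuple

class Field(NamedTuple):
    # Simplified stand-in for pyspark.sql.types.StructField
    name: str
    dtype: str
    nullable: bool

def schema_diff(left: list[Field], right: list[Field]) -> list[tuple[str, dict[str, Field]]]:
    l, r = {f.name: f for f in left}, {f.name: f for f in right}
    results: list[tuple[str, dict[str, Field]]] = []
    for name in l.keys() - r.keys():              # only on the left -> "add"
        results.append(("add", {"left": l[name]}))
    for name in r.keys() - l.keys():              # only on the right -> "remove"
        results.append(("remove", {"right": r[name]}))
    for name in l.keys() & r.keys():
        if l[name].dtype != r[name].dtype:        # data type changed
            results.append(("change_type", {"left": l[name], "right": r[name]}))
        if l[name].nullable != r[name].nullable:  # nullability changed
            results.append(("change_nullable", {"left": l[name], "right": r[name]}))
    return results
```

Run against the df1/df2 schemas from the Set up below, this would flag e as added, g as removed, and c under both change_type and change_nullable.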

    Examples

    Set up
    >>> # Imports
    >>> from pprint import pprint
    >>> import pandas as pd
    >>> from pyspark.sql import SparkSession, functions as F
    >>> from toolbox_pyspark.schema import view_schema_differences
    >>> from toolbox_pyspark.io import write_to_path
    >>> from toolbox_pyspark.checks import table_exists
    >>>
    >>> # Instantiate Spark
    >>> spark = SparkSession.builder.getOrCreate()
    >>>
    >>> # Create data
    >>> df1 = spark.createDataFrame(
    ...     pd.DataFrame(
    ...         {
    ...             "a": [0, 1, 2, 3],
    ...             "b": ["a", "b", "c", "d"],
    ...             "c": ["1", "1", "1", "1"],
    ...             "d": ["2", "2", "2", "2"],
    ...             "e": ["3", "3", "3", "3"],
    ...             "f": ["4", "4", "4", "4"],
    ...         }
    ...     )
    ... )
    >>> df2 = (
    ...     df1.withColumn("c", F.col("c").cast("int"))
    ...     .withColumn("g", F.lit("a"))
    ...     .withColumn("d", F.lit("null"))
    ...     .drop("e")
    ... )
    >>> write_to_path(
    ...     table=df1,
    ...     name="left",
    ...     path="./test",
    ...     data_format="parquet",
    ...     mode="overwrite",
    ...     write_options={"overwriteSchema": "true"},
    ... )
    >>> write_to_path(
    ...     table=df2,
    ...     name="right",
    ...     path="./test",
    ...     data_format="parquet",
    ...     mode="overwrite",
    ...     write_options={"overwriteSchema": "true"},
    ... )
    >>>
    >>> # Check
    >>> pprint(df1.dtypes)
    >>> print(df1.show())
    >>> print(table_exists("left", "./test", "parquet", spark))
    >>> pprint(df2.dtypes)
    >>> print(df2.show())
    >>> print(table_exists("right", "./test", "parquet", spark))
    
    Terminal
    [
        ("a", "bigint"),
        ("b", "string"),
        ("c", "string"),
        ("d", "string"),
        ("e", "string"),
        ("f", "string"),
    ]
    
    Terminal
    +---+---+---+---+---+---+
    | a | b | c | d | e | f |
    +---+---+---+---+---+---+
    | 0 | a | 1 | 2 | 3 | 4 |
    | 1 | b | 1 | 2 | 3 | 4 |
    | 2 | c | 1 | 2 | 3 | 4 |
    | 3 | d | 1 | 2 | 3 | 4 |
    +---+---+---+---+---+---+
    
    Terminal
    True
    
    Terminal
    [
        ("a", "bigint"),
        ("b", "string"),
        ("c", "int"),
        ("d", "string"),
        ("f", "string"),
        ("g", "string"),
    ]
    
    Terminal
    +---+---+---+------+---+---+
    | a | b | c |    d | f | g |
    +---+---+---+------+---+---+
    | 0 | a | 1 | null | 4 | a |
    | 1 | b | 1 | null | 4 | a |
    | 2 | c | 1 | null | 4 | a |
    | 3 | d | 1 | null | 4 | a |
    +---+---+---+------+---+---+
    
    Terminal
    True
    

    Example 1: Check matching
    >>> diff = view_schema_differences(
    ...     method="table_table",
    ...     left_table=df1,
    ...     right_table=df1,
    ...     include_add_field=True,
    ...     include_change_field=True,
    ...     include_remove_field=True,
    ...     include_change_nullable=True,
    ...     view_type="return",
    ... )
    >>> print(diff)
    
    Terminal
    None
    

    Conclusion: Schemas match.

    Example 2: Check print
    >>> view_schema_differences(
    ...     method="table_table",
    ...     left_table=df1,
    ...     right_table=df2,
    ...     include_add_field=True,
    ...     include_change_field=False,
    ...     include_remove_field=False,
    ...     include_change_nullable=False,
    ...     view_type="print",
    ... )
    
    Terminal
    [('add', {'left': StructField('e', StringType(), True)})]
    

    Conclusion: Schemas do not match because the e field was added.

    Example 3: Check pprint
    >>> view_schema_differences(
    ...     method="table_table",
    ...     left_table=df1,
    ...     right_table=df2,
    ...     include_add_field=True,
    ...     include_change_field=True,
    ...     include_remove_field=True,
    ...     include_change_nullable=True,
    ...     view_type="pprint",
    ... )
    
    Terminal
    [('add', {'left': StructField('e', StringType(), False)}),
     ('remove', {'right': StructField('g', StringType(), False)}),
     ('change_type',
      {'left': StructField('c', StringType(), False),
       'right': StructField('c', IntegerType(), True)}),
     ('change_nullable',
      {'left': StructField('c', StringType(), False),
       'right': StructField('c', IntegerType(), True)})]
    

    Conclusion: Schemas do not match because the e field was added, the g field was removed, the c field had its data type changed, and the c field had its nullability changed.
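
    The four categories above (`add`, `remove`, `change_type`, `change_nullable`) come from a field-by-field diff of the two schemas. The sketch below approximates that comparison using plain `(name, dtype, nullable)` tuples for brevity; the real function operates on `pyspark.sql.types.StructField` objects, and `diff_schemas` is a hypothetical name:

    ```python
    def diff_schemas(left, right):
        # Index both schemas by field name for O(1) lookups.
        left_by_name = {name: (dtype, nullable) for name, dtype, nullable in left}
        right_by_name = {name: (dtype, nullable) for name, dtype, nullable in right}
        diffs = []
        for name, (dtype, nullable) in left_by_name.items():
            if name not in right_by_name:
                # Field exists only on the left: it was "added" to the left side.
                diffs.append(("add", {"left": (name, dtype, nullable)}))
                continue
            r_dtype, r_nullable = right_by_name[name]
            if dtype != r_dtype:
                diffs.append(("change_type", {
                    "left": (name, dtype, nullable),
                    "right": (name, r_dtype, r_nullable),
                }))
            if nullable != r_nullable:
                diffs.append(("change_nullable", {
                    "left": (name, dtype, nullable),
                    "right": (name, r_dtype, r_nullable),
                }))
        for name, (dtype, nullable) in right_by_name.items():
            if name not in left_by_name:
                # Field exists only on the right: it was "removed" from the left side.
                diffs.append(("remove", {"right": (name, dtype, nullable)}))
        return diffs
    ```

    The `include_*` flags then simply filter which of these four categories make it into the final result.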

    Example 4: Check with right-hand side as a `path`
    >>> view_schema_differences(
    ...     method="table_table",
    ...     spark_session=spark,
    ...     left_table=df1,
    ...     right_table_path="./test",
    ...     right_table_name="right",
    ...     right_table_format="parquet",
    ...     include_add_field=True,
    ...     include_change_field=False,
    ...     include_remove_field=False,
    ...     include_change_nullable=False,
    ...     view_type="pprint",
    ... )
    
    Terminal
    [('add', {'left': StructField('e', StringType(), True)})]
    

    Conclusion: Schemas do not match because the e field was added.

    Example 5: Check with both sides being a `path`
    >>> view_schema_differences(
    ...     method="table_table",
    ...     spark_session=spark,
    ...     left_table_path="./test",
    ...     left_table_name="left",
    ...     left_table_format="parquet",
    ...     right_table_path="./test",
    ...     right_table_name="right",
    ...     right_table_format="parquet",
    ...     include_add_field=False,
    ...     include_change_field=False,
    ...     include_remove_field=True,
    ...     include_change_nullable=False,
    ...     view_type="pprint",
    ... )
    
    Terminal
    [('remove', {'right': StructField('g', StringType(), True)})]
    

    Conclusion: Schemas do not match because the g field was removed.

    Example 6: Invalid `method` parameter
    >>> view_schema_differences(
    ...     method="table_table_table",
    ...     left_table=df1,
    ...     right_table=df2,
    ...     include_add_field=True,
    ...     include_change_field=True,
    ...     include_remove_field=True,
    ...     include_change_nullable=True,
    ...     view_type="return",
    ... )
    
    Terminal
    AttributeError: Invalid value for `method`: 'table_table_table'
    Please use one of the following options:
    - For `by_table_and_table`, use one of the following values: ['table', 'table_table', 'tables', 'by_table', 'by_table_and_table', 'table_and_table']
    - For `by_table_and_path`, use one of the following values: ['table_and_path', 'table_path', 'by_table_and_path']
    - For `by_path_and_table`, use one of the following values: ['path_and_table', 'path_table', 'by_path_and_table']
    - For `by_path_and_path`, use one of the following values: ['path_and_path', 'path_path', 'by_path_and_path', 'path', 'paths']
    

    Conclusion: Invalid method parameter.
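
    The alias resolution implied by the error message above can be sketched as a simple lookup table mapping each canonical mode to its accepted spellings (`resolve_method` and `METHOD_ALIASES` are illustrative names, not the library's API):

    ```python
    # Alias lists mirror those shown in the AttributeError message above.
    METHOD_ALIASES = {
        "by_table_and_table": [
            "table", "table_table", "tables",
            "by_table", "by_table_and_table", "table_and_table",
        ],
        "by_table_and_path": ["table_and_path", "table_path", "by_table_and_path"],
        "by_path_and_table": ["path_and_table", "path_table", "by_path_and_table"],
        "by_path_and_path": [
            "path_and_path", "path_path", "by_path_and_path", "path", "paths",
        ],
    }

    def resolve_method(method: str) -> str:
        # Return the canonical mode for a recognised alias, else raise.
        for canonical, aliases in METHOD_ALIASES.items():
            if method in aliases:
                return canonical
        raise AttributeError(f"Invalid value for `method`: {method!r}")
    ```

    Any string outside these lists raises `AttributeError`, which is why `"table_table_table"` fails.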

    See Also
    Source code in src/toolbox_pyspark/schema.py
    @typechecked
    def view_schema_differences(
        method: str = "by_table_and_table",
        spark_session: Optional[SparkSession] = None,
        left_table: Optional[psDataFrame] = None,
        left_table_path: Optional[str] = None,
        left_table_name: Optional[str] = None,
        left_table_format: str = "delta",
        right_table: Optional[psDataFrame] = None,
        right_table_path: Optional[str] = None,
        right_table_name: Optional[str] = None,
        right_table_format: str = "delta",
        include_change_field: bool = True,
        include_add_field: bool = True,
        include_remove_field: bool = True,
        include_change_nullable: bool = False,
        view_type: Literal["print", "pprint", "return"] = "pprint",
    ) -> Optional[Union[list[tuple[str, dict[str, StructField]]], bool]]:
        """
        !!! note "Summary"
            View the schemas between two different tables.
    
        ???+ abstract "Details"
        The primary difference between [`check_schemas_match()`][toolbox_pyspark.schema.check_schemas_match] and [`view_schema_differences()`][toolbox_pyspark.schema.view_schema_differences] is that [`check_...()`][toolbox_pyspark.schema.check_schemas_match] returns either a `#!py bool` result or the actual details of the schema differences, whilst [`view_...()`][toolbox_pyspark.schema.view_schema_differences] may also return the actual details object, but it will additionally print the result to the terminal for you to review.<br>
            For full details of all the parameters and all the options, including nuances and detailed explanations and thorough examples, please check the [`check_schemas_match()`][toolbox_pyspark.schema.check_schemas_match] function.
    
        Params:
            method (str, optional):
            The method to use for the comparison. That is, is either side a table in memory, or is it a `table` sitting on a `path`? Check the Notes section for all options available for this parameter.<br>
                Defaults to `#!py "by_table_and_table"`.
            spark_session (Optional[SparkSession], optional):
                The `SparkSession` to use if either the `left` or `right` tables are sitting on a `path` somewhere.<br>
                Defaults to `#!py None`.
            left_table (Optional[psDataFrame], optional):
                If `method` defines the `left` table as a `table`, then this parameter is the actual `dataframe` to do the checking against.<br>
                Defaults to `#!py None`.
            left_table_path (Optional[str], optional):
                If `method` defines the `left` table as a `path`, then this parameter is the actual path location where the table can be found.<br>
                Defaults to `#!py None`.
            left_table_name (Optional[str], optional):
                If `method` defines the `left` table as a `path`, then this parameter is the name of the table found at the given `left_table_path` location.<br>
                Defaults to `#!py None`.
            left_table_format (str, optional):
                If `method` defines the `left` table as a `path`, then this parameter is the format of the table found at the given `left_table_path` location.<br>
                Defaults to `#!py "delta"`.
            right_table (Optional[psDataFrame], optional):
                If `method` defines the `right` table as a `table`, then this parameter is the actual `dataframe` to do the checking against.<br>
                Defaults to `#!py None`.
            right_table_path (Optional[str], optional):
                If `method` defines the `right` table as a `path`, then this parameter is the actual path location where the table can be found.<br>
                Defaults to `#!py None`.
            right_table_name (Optional[str], optional):
                If `method` defines the `right` table as a `path`, then this parameter is the name of the table found at the given `right_table_path` location.<br>
                Defaults to `#!py None`.
            right_table_format (str, optional):
                If `method` defines the `right` table as a `path`, then this parameter is the format of the table found at the given `right_table_path` location.<br>
                Defaults to `#!py "delta"`.
            include_change_field (bool, optional):
                When doing the schema validations, do you want to include any fields where the data-type on the right-hand side is different from the left-hand side?<br>
                This can be read as: "What fields have had their data type _changed **between**_ the left-hand side and the right-hand side?"<br>
                Defaults to `#!py True`.
            include_add_field (bool, optional):
                When doing the schema validations, do you want to include any fields that have had any additional fields added to the left-hand side, when compared to the right-hand side?<br>
                This can be read as: "What fields have been _added **to**_ the left-hand side?"<br>
                Defaults to `#!py True`.
            include_remove_field (bool, optional):
                When doing the schema validations, do you want to include any fields which are missing from the left-hand side and only existing on the right-hand side?<br>
                This can be read as: "What fields been _removed **from**_ the left-hand side?"<br>
                Defaults to `#!py True`.
            include_change_nullable (bool, optional):
            When doing the schema validations, do you want to include any fields which have had their nullability metadata changed on the right-hand side, when compared to the left-hand side?<br>
                This can be read as: "What fields had their nullability _changed **between**_ the left-hand side and the right-hand side?"<br>
                Defaults to `#!py False`.
            view_type (Literal["print", "pprint", "return"], optional):
                When returning the output from this function, how do you want it to be displayed? Must be one of `#!py ["print", "pprint", "return"]`.<br>
                Logically, the difference is that `#!py "print"` will display a text value to the terminal that is not formatted in any way; `#!py "pprint"` will display a pretty-printed text value to the terminal; and `#!py "return"` will return the schema differences which can then be assigned to another variable.<br>
                Defaults to `#!py "pprint"`.
    
        Raises:
            TypeError:
                If any of the inputs parsed to the parameters of this function are not the correct type. Uses the [`@typeguard.typechecked`](https://typeguard.readthedocs.io/en/stable/api.html#typeguard.typechecked) decorator.
            AttributeError:
            If the value passed to `method` is not a valid option.
    
        Returns:
            (Optional[list[tuple[str, dict[str, StructField]]]]):
            If `#!py view_type="return"`, then this will be a `#!py list` of `#!py tuple`s of `#!py dict`s containing the details of the schema differences. If `#!py view_type!="return"` (or if `#!py view_type="return"` but there are actually no differences in the schema), then nothing is returned; only printed to terminal.
    
        ???+ example "Examples"
    
            ```{.py .python linenums="1" title="Set up"}
            >>> # Imports
            >>> from pprint import pprint
            >>> import pandas as pd
            >>> from pyspark.sql import SparkSession, functions as F
            >>> from toolbox_pyspark.schema import view_schema_differences
            >>> from toolbox_pyspark.io import write_to_path
            >>> from toolbox_pyspark.checks import table_exists
            >>>
            >>> # Instantiate Spark
            >>> spark = SparkSession.builder.getOrCreate()
            >>>
            >>> # Create data
            >>> df1 = spark.createDataFrame(
            ...     pd.DataFrame(
            ...         {
            ...             "a": [0, 1, 2, 3],
            ...             "b": ["a", "b", "c", "d"],
            ...             "c": ["1", "1", "1", "1"],
            ...             "d": ["2", "2", "2", "2"],
            ...             "e": ["3", "3", "3", "3"],
            ...             "f": ["4", "4", "4", "4"],
            ...         }
            ...     )
            ... )
            >>> df2 = (
            ...     df1.withColumn("c", F.col("c").cast("int"))
            ...     .withColumn("g", F.lit("a"))
            ...     .withColumn("d", F.lit("null"))
            ...     .drop("e")
            ... )
            >>> write_to_path(
            ...     table=df1,
            ...     name="left",
            ...     path="./test",
            ...     data_format="parquet",
            ...     mode="overwrite",
            ...     write_options={"overwriteSchema": "true"},
            ... )
            >>> write_to_path(
            ...     table=df2,
            ...     name="right",
            ...     path="./test",
            ...     data_format="parquet",
            ...     mode="overwrite",
            ...     write_options={"overwriteSchema": "true"},
            ... )
            >>>
            >>> # Check
            >>> pprint(df1.dtypes)
            >>> print(df1.show())
            >>> print(table_exists("left", "./test", "parquet", spark))
            >>> pprint(df2.dtypes)
            >>> print(df2.show())
            >>> print(table_exists("right", "./test", "parquet", spark))
            ```
            <div class="result" markdown>
            ```{.sh .shell title="Terminal"}
            [
                ("a", "bigint"),
                ("b", "string"),
                ("c", "string"),
                ("d", "string"),
                ("e", "string"),
                ("f", "string"),
            ]
            ```
            ```{.txt .text title="Terminal"}
            +---+---+---+---+---+---+
            | a | b | c | d | e | f |
            +---+---+---+---+---+---+
            | 0 | a | 1 | 2 | 3 | 4 |
            | 1 | b | 1 | 2 | 3 | 4 |
            | 2 | c | 1 | 2 | 3 | 4 |
            | 3 | d | 1 | 2 | 3 | 4 |
            +---+---+---+---+---+---+
            ```
            ```{.sh .shell title="Terminal"}
            True
            ```
            ```{.sh .shell title="Terminal"}
            [
                ("a", "bigint"),
                ("b", "string"),
                ("c", "int"),
                ("d", "string"),
                ("f", "string"),
                ("g", "string"),
            ]
            ```
            ```{.txt .text title="Terminal"}
            +---+---+---+------+---+---+
            | a | b | c |    d | f | g |
            +---+---+---+------+---+---+
        | 0 | a | 1 | null | 4 | a |
        | 1 | b | 1 | null | 4 | a |
        | 2 | c | 1 | null | 4 | a |
        | 3 | d | 1 | null | 4 | a |
            +---+---+---+------+---+---+
            ```
            ```{.sh .shell title="Terminal"}
            True
            ```
            </div>
    
            ```{.py .python linenums="1" title="Example 1: Check matching"}
        >>> diff = view_schema_differences(
            ...     method="table_table",
            ...     left_table=df1,
            ...     right_table=df1,
            ...     include_add_field=True,
            ...     include_change_field=True,
            ...     include_remove_field=True,
            ...     include_change_nullable=True,
            ...     view_type="return",
            ... )
            >>> print(diff)
            ```
            <div class="result" markdown>
            ```{.sh .shell title="Terminal"}
            None
            ```
            !!! success "Conclusion: Schemas match."
            </div>
    
            ```{.py .python linenums="1" title="Example 2: Check print"}
            >>> view_schema_differences(
            ...     method="table_table",
            ...     left_table=df1,
            ...     right_table=df2,
            ...     include_add_field=True,
            ...     include_change_field=False,
            ...     include_remove_field=False,
            ...     include_change_nullable=False,
            ...     view_type="print",
            ... )
            ```
            <div class="result" markdown>
            ```{.sh .shell title="Terminal"}
            [('add', {'left': StructField('e', StringType(), True)})]
            ```
            !!! failure "Conclusion: Schemas do not match because the `e` field was added."
            </div>
    
            ```{.py .python linenums="1" title="Example 3: Check pprint"}
            >>> view_schema_differences(
            ...     method="table_table",
            ...     left_table=df1,
            ...     right_table=df2,
            ...     include_add_field=True,
            ...     include_change_field=True,
            ...     include_remove_field=True,
            ...     include_change_nullable=True,
            ...     view_type="pprint",
            ... )
            ```
            <div class="result" markdown>
            ```{.sh .shell title="Terminal"}
            [('add', {'left': StructField('e', StringType(), False)}),
             ('remove', {'right': StructField('g', StringType(), False)}),
             ('change_type',
              {'left': StructField('c', StringType(), False),
               'right': StructField('c', IntegerType(), True)}),
             ('change_nullable',
              {'left': StructField('c', StringType(), False),
               'right': StructField('c', IntegerType(), True)})]
            ```
            !!! failure "Conclusion: Schemas do not match because the `e` field was added, the `g` field was removed, the `c` field had its data type changed, and the `c` field had its nullability changed."
            </div>
    
            ```{.py .python linenums="1" title="Example 4: Check with right-hand side as a `path`"}
            >>> view_schema_differences(
            ...     method="table_table",
            ...     spark_session=spark,
            ...     left_table=df1,
            ...     right_table_path="./test",
            ...     right_table_name="right",
            ...     right_table_format="parquet",
            ...     include_add_field=True,
            ...     include_change_field=False,
            ...     include_remove_field=False,
            ...     include_change_nullable=False,
            ...     view_type="pprint",
            ... )
            ```
            <div class="result" markdown>
            ```{.sh .shell title="Terminal"}
            [('add', {'left': StructField('e', StringType(), True)})]
            ```
            !!! failure "Conclusion: Schemas do not match because the `e` field was added."
            </div>
    
            ```{.py .python linenums="1" title="Example 5: Check with both sides being a `path`"}
            >>> view_schema_differences(
            ...     method="table_table",
            ...     spark_session=spark,
            ...     left_table_path="./test",
            ...     left_table_name="left",
            ...     left_table_format="parquet",
            ...     right_table_path="./test",
            ...     right_table_name="right",
            ...     right_table_format="parquet",
            ...     include_add_field=False,
            ...     include_change_field=False,
            ...     include_remove_field=True,
            ...     include_change_nullable=False,
            ...     view_type="pprint",
            ... )
            ```
            <div class="result" markdown>
            ```{.sh .shell title="Terminal"}
            [('remove', {'right': StructField('g', StringType(), True)})]
            ```
            !!! failure "Conclusion: Schemas do not match because the `g` field was removed."
            </div>
    
            ```{.py .python linenums="1" title="Example 6: Invalid `method` parameter"}
            >>> view_schema_differences(
            ...     method="table_table_table",
            ...     left_table=df1,
            ...     right_table=df2,
            ...     include_add_field=True,
            ...     include_change_field=True,
            ...     include_remove_field=True,
            ...     include_change_nullable=True,
            ...     view_type="return",
            ... )
            ```
            <div class="result" markdown>
        ```{.sh .shell title="Terminal"}
            AttributeError: Invalid value for `method`: 'table_table_table'
            Please use one of the following options:
            - For `by_table_and_table`, use one of the following values: ['table', 'table_table', 'tables', 'by_table', 'by_table_and_table', 'table_and_table']
            - For `by_table_and_path`, use one of the following values: ['table_and_path', 'table_path', 'by_table_and_path']
            - For `by_path_and_table`, use one of the following values: ['path_and_table', 'path_table', 'by_path_and_table']
            - For `by_path_and_path`, use one of the following values: ['path_and_path', 'path_path', 'by_path_and_path', 'path', 'paths']
            ```
            !!! failure "Conclusion: Invalid `method` parameter."
            </div>
    
        ??? tip "See Also"
            - [`check_schemas_match()`][toolbox_pyspark.schema.check_schemas_match]
        """
    
        valid_methods: ValidMethods = ValidMethods()
        msg: str = "If using the '{meth}' method, then '{name}' cannot be 'None'."
    
        if method in valid_methods.by_table_and_table:
            assert left_table is not None, msg.format(meth=method, name="left_table")
            assert right_table is not None, msg.format(meth=method, name="right_table")
            return _view_schema_differences_by_table_and_table(
                left_table=left_table,
                right_table=right_table,
                include_change_field=include_change_field,
                include_add_field=include_add_field,
                include_remove_field=include_remove_field,
                include_change_nullable=include_change_nullable,
                view_type=view_type,
            )
        elif method in valid_methods.by_table_and_path:
            assert left_table is not None, msg.format(meth=method, name="left_table")
            assert right_table_path is not None, msg.format(meth=method, name="right_table_path")
            assert right_table_name is not None, msg.format(meth=method, name="right_table_name")
            assert spark_session is not None, msg.format(meth=method, name="spark_session")
            return _view_schema_differences_by_table_and_path(
                left_table=left_table,
                right_table_path=right_table_path,
                right_table_name=right_table_name,
                right_table_format=right_table_format,
                spark_session=spark_session,
                include_change_field=include_change_field,
                include_add_field=include_add_field,
                include_remove_field=include_remove_field,
                include_change_nullable=include_change_nullable,
                view_type=view_type,
            )
        elif method in valid_methods.by_path_and_table:
            assert left_table_path is not None, msg.format(meth=method, name="left_table_path")
            assert left_table_name is not None, msg.format(meth=method, name="left_table_name")
            assert right_table is not None, msg.format(meth=method, name="right_table")
            assert spark_session is not None, msg.format(meth=method, name="spark_session")
            return _view_schema_differences_by_path_and_table(
                left_table_path=left_table_path,
                left_table_name=left_table_name,
                left_table_format=left_table_format,
                right_table=right_table,
                spark_session=spark_session,
                include_change_field=include_change_field,
                include_add_field=include_add_field,
                include_remove_field=include_remove_field,
                include_change_nullable=include_change_nullable,
                view_type=view_type,
            )
        elif method in valid_methods.by_path_and_path:
            assert left_table_path is not None, msg.format(meth=method, name="left_table_path")
            assert left_table_name is not None, msg.format(meth=method, name="left_table_name")
            assert right_table_path is not None, msg.format(meth=method, name="right_table_path")
            assert right_table_name is not None, msg.format(meth=method, name="right_table_name")
            assert spark_session is not None, msg.format(meth=method, name="spark_session")
            return _view_schema_differences_by_path_and_path(
                left_table_path=left_table_path,
                left_table_name=left_table_name,
                left_table_format=left_table_format,
                right_table_path=right_table_path,
                right_table_name=right_table_name,
                right_table_format=right_table_format,
                spark_session=spark_session,
                include_change_field=include_change_field,
                include_add_field=include_add_field,
                include_remove_field=include_remove_field,
                include_change_nullable=include_change_nullable,
                view_type=view_type,
            )
        else:
            raise AttributeError(
                f"Invalid value for `method`: '{method}'\n"
                f"Please use one of the following options:\n"
            f"- For `by_table_and_table`, use one of the following values: {valid_methods.by_table_and_table}\n"
            f"- For `by_table_and_path`, use one of the following values: {valid_methods.by_table_and_path}\n"
            f"- For `by_path_and_table`, use one of the following values: {valid_methods.by_path_and_table}\n"
            f"- For `by_path_and_path`, use one of the following values: {valid_methods.by_path_and_path}\n"
            )