SQL AST Similarity
Definitions
SQL AST Similarity compares the structure of two SQL queries by analyzing their Abstract Syntax Trees (ASTs). The metric matches nodes between the two trees, taking into account the statement types and their arrangement. Each kind of tree difference (insert, remove, update, move, and so on) carries its own weight when the final similarity score is calculated.
Example Usage
Required data items: answer, ground_truth_answers
You can optionally initialize the metric so that it compares optimized SQL queries, produced by the sqlglot optimizer, and optionally pass in the schema.
You can also customize the weights assigned to different types of nodes in the AST diff. Higher weights indicate more significant changes, which are expected to have a greater impact on query semantics.
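To make the role of the weights concrete, here is a minimal sketch of how a weighted diff could be turned into a score. It is not the metric's actual implementation: the function `weighted_similarity`, the `DEFAULT_WEIGHTS` values, and the formula (one minus the average per-node penalty) are all assumptions chosen for illustration, and the edit counts are supplied by hand rather than computed from a real AST diff.

```python
# Hypothetical weights per diff operation, each in [0, 1].
# These values are illustrative assumptions, not the library's defaults.
DEFAULT_WEIGHTS = {
    "keep": 0.0,    # node unchanged between the two trees
    "update": 0.5,  # node kept but its value changed
    "move": 0.5,    # node kept but relocated in the tree
    "insert": 1.0,  # node present only in the second tree
    "remove": 1.0,  # node present only in the first tree
}


def weighted_similarity(edit_counts, weights=None):
    """Return a score in [0, 1] from counts of each diff operation.

    Assumed formula: 1 - (sum of weight * count) / (total node operations).
    """
    merged = {**DEFAULT_WEIGHTS, **(weights or {})}
    total = sum(edit_counts.values())
    if total == 0:
        # Two empty diffs are treated as identical queries.
        return 1.0
    penalty = sum(merged[op] * count for op, count in edit_counts.items())
    return max(0.0, 1.0 - penalty / total)


# Hand-written example: 8 nodes kept, 1 updated, 1 inserted.
counts = {"keep": 8, "update": 1, "insert": 1}
default_score = weighted_similarity(counts)
# Treat updates as seriously as inserts by raising their weight.
strict_score = weighted_similarity(counts, {"update": 1.0})
print(default_score, strict_score)
```

Under these assumptions, raising the weight of an operation can only lower the score for a diff that contains that operation, which matches the intent that heavier weights mark changes with more impact on query semantics.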