Add structured properties information when using tool get_entity

Adding Structured Properties to the get_entity Tool

The get_entity tool in mcp-server-datahub retrieves information about entities such as glossary terms and data products. A user found that the tool did not return structured properties for glossary terms and data products. This article explains the issue and its root cause, and shows how to contribute a fix.

The Problem: Missing Structured Properties

The core issue is that the GraphQL query used by the get_entity tool was incomplete. Specifically, it lacked the necessary fields to fetch structured properties for certain entity types, like Glossary Terms and Data Products. This meant that users relying on the tool to retrieve comprehensive information about these entities were missing crucial details.

Root Cause: Incomplete GraphQL Query

The gap comes from the original implementation of the query: structured properties were selected for some entity types but not for all of them. A likely explanation is that the initial work focused on retrieving basic entity information, and structured properties were not anticipated or prioritized for every entity type.

Solution: Modifying the GraphQL Query

The solution involves modifying the GraphQL query to include the missing structured properties. Here's a step-by-step guide:

  1. Identify the GraphQL Query: Locate the GraphQL query used by the get_entity tool. This is typically within the codebase responsible for handling entity retrieval.
  2. Add the Missing Fields: Modify the query to include the fields that retrieve structured properties. For Glossary Terms, the structuredProperties selection in the following fragment is the relevant addition. The spreads ...structuredPropertiesFields and ...deprecationFields refer to named fragments, which must be defined in the same query document:

... on GlossaryTerm {
  hierarchicalName
  properties {
    name
    description
    termSource
    sourceRef
    sourceUrl
    rawSchema
    customProperties {
      key
      value
    }
  }
  structuredProperties {
    properties {
      ...structuredPropertiesFields
    }
  }
  deprecation {
    ...deprecationFields
  }
}
  3. Apply Similar Changes to DataProduct: Repeat the process for Data Products or any other entity type missing structured properties. Ensure you include the appropriate fields relevant to each entity.
  4. Test the Changes: Thoroughly test the modified query to ensure that structured properties are now being returned correctly for all relevant entity types. You can use GraphQL explorers or write integration tests to verify the behavior.
  5. Submit a Pull Request: Once you are confident that the changes are working correctly, submit a pull request to the mcp-server-datahub repository. Be sure to include a clear description of the issue and the solution you have implemented.
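
The testing step can be partly automated with a static check on the query text. The following is a minimal sketch, not code from mcp-server-datahub: the helper names fragment_body and selects_structured_properties and the SAMPLE_QUERY text are hypothetical, and the sketch assumes the query is available as a plain string. It uses simple brace matching rather than a real GraphQL parser, so braces inside string literals or comments would confuse it.

```python
import re
from typing import Optional

# Hypothetical query text for illustration only; the real query in
# mcp-server-datahub will contain many more fields and entity types.
SAMPLE_QUERY = """
query GetEntity($urn: String!) {
  entity(urn: $urn) {
    urn
    ... on GlossaryTerm {
      hierarchicalName
      structuredProperties {
        properties {
          ...structuredPropertiesFields
        }
      }
    }
    ... on DataProduct {
      properties {
        name
      }
    }
  }
}
"""


def fragment_body(query: str, type_name: str) -> Optional[str]:
    """Return the text inside `... on <type_name> { ... }`, or None if absent."""
    pattern = r"\.\.\.\s*on\s+" + re.escape(type_name) + r"\s*\{"
    match = re.search(pattern, query)
    if match is None:
        return None
    depth = 1  # we are just past the fragment's opening brace
    start = match.end()
    for i in range(start, len(query)):
        if query[i] == "{":
            depth += 1
        elif query[i] == "}":
            depth -= 1
            if depth == 0:
                return query[start:i]
    return None  # unbalanced braces


def selects_structured_properties(query: str, type_name: str) -> bool:
    """True if the inline fragment for type_name selects structuredProperties."""
    body = fragment_body(query, type_name)
    if body is None:
        return False
    return re.search(r"\bstructuredProperties\s*\{", body) is not None


if __name__ == "__main__":
    for name in ("GlossaryTerm", "DataProduct"):
        print(name, selects_structured_properties(SAMPLE_QUERY, name))
```

In this sample the check succeeds for GlossaryTerm and fails for DataProduct, which is the state you would expect before the Data Product fragment has been updated. A check like this only confirms that the field is requested; an integration test against a running DataHub instance is still needed to confirm the values come back.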

Considerations for Associated Entities

During the discussion of this issue, the topic of associated entities for Data Products was raised. While it might be tempting to include all associated entities in the get_entity tool, it's important to consider the impact on performance and LLM context size. As suggested in the community discussion, it might be more efficient to use the search tool to find entities associated with a Data Product, especially if the number of associated entities is large.
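
To make the context-size concern concrete, here is a small sketch of one way a tool response could stay bounded while still signaling that more associated entities exist. It is an illustration only, not how get_entity behaves: the function cap_associated_entities, its limit parameter, and the example URNs are all hypothetical.

```python
from typing import Any, Dict, List


def cap_associated_entities(urns: List[str], limit: int = 10) -> Dict[str, Any]:
    """Keep at most `limit` URNs and record how many exist in total.

    Hypothetical helper: the `truncated` flag tells the caller (or an LLM)
    that the remaining entities should be fetched with a follow-up search.
    """
    shown = urns[:limit]
    return {
        "total": len(urns),
        "entities": shown,
        "truncated": len(urns) > len(shown),
    }


if __name__ == "__main__":
    # Made-up URNs purely for demonstration.
    example = [f"urn:li:dataset:example-{i}" for i in range(25)]
    summary = cap_associated_entities(example, limit=10)
    print(summary["total"], len(summary["entities"]), summary["truncated"])
```

The design choice here is to return a count and a truncation flag rather than the full list, so the response size does not grow with the number of associated entities.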

Conclusion

By following these steps and considerations, you can help improve the get_entity tool so that it returns structured properties for glossary terms, data products, and any other entity types that were missing them.