Appendix - Templates and Rubrics

Appendix A. Methodology

ciuTshi makes adaptable data operations simple. Despite the size and complexity of this appendix (and the overall standard), building metadata and process value are incremental and cumulative efforts. The challenge of generating rich metadata for modeling and analysis is a team effort, shared by several persons in the organization for which the data provides value. The aim of ciuTshi is to make the connection between the data and its value to the organization clear and valid for its users. What follows is a brief overview of an adaptable methodology for using this standard to gather and generate data asset value metrics.

Verification

ciuTshi starts its data operations journey with clear requirements for a data asset. If your institution has neither a clear need for a data set or collection nor a good value proposition for using people and resources to process the data into a usable endpoint, the data asset is not needed. Begin by reviewing and verifying the requirements with project leads, leadership, and other stakeholders who may provide clarity on the intent and purpose for a data asset request. If there is a clear justification for tasking a data engineering team with acquisition and curation of the data asset, start using dimensions, features, and user stories for the data to conceive of metrics which can validate process value for the institution (e.g., How does the use of this data impact the project and organization? How do I show this impact to leadership, stakeholders, and non-technical personnel?).

Note: This step is where knowledge gaps are also identified. Even if the data is not accessible or is non-existent, an effort should be made to note the gap and what value the data, if present, would provide the organization. This will leave an ontological marker for later data enrichment via proxy data assets and/or metrics.

Selection

Every modular practice in ciuTshi has a template and rubric to facilitate eventual population of metadata for a data asset: creating data provenance for the organization’s metadata lineage. This process is a team effort with several data professionals tasked to action one or more modules for a given data asset. To begin, review the module’s documentation: summary, template, rubric, and any worksheets that assist in populating metadata for the data asset. Each module has roles, each with its own responsibilities and tasks that may depend on or overlap with other modules, tasks, and roles. A number of data professionals and organizational team members facilitate these tasks and modules by performing one or more roles within and between modules. This approach to task management within data operations allows organizations with variable available personnel to layer and delegate tasks within reasonable deadlines for both delivery of the data asset and population of the metadata.

Note: This represents a situation common for many organizations and is why ciuTshi’s metadata model is so critical: the connection between metadata and the socio-technical space of data operations is full of critical metrics that lend themselves to quantifying process value in data practices.

Collection

Once requirements are validated and tasks are appropriately assigned, data professionals should collect and populate their portion of the metadata in the process of facilitating their data engineering, management, and governance tasks. All fields in the rubric schema should be filled out as completely as possible. It is understood that you may not use all the fields at first, but it is recommended that you maintain the entire schema as future data requirements may expand the scope of metadata needs. This also ensures that as ciuTshi grows, augmenting your existing metadata collections is simplified.

Additionally, metadata collection should observe existing scientific research practices when populating the metadata rubrics. Do not make up data: null elements or empty fields are acceptable inputs if sufficient data is unavailable. Every member of the data operations team should agree on a consistent manner of collecting and entering data into the rubric, and the agreed method should be noted in the content management documents for the data project. It is recommended that this consistent data collection be rigorously adhered to across all modules, within and between data projects: the more consistent and rigorous the metadata collection is, and the better shifts in these practices are tracked, the more valid the metric model outputs may be for decision-makers, stakeholders, and customers. This ability to validate metrics with collection practices aids in cross-validated review of those metrics with others throughout the organization.
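
The collection guidance above can be sketched in code. The following is a minimal illustration and not part of the ciuTshi standard: the helper names blank_record, populate, and gaps are hypothetical, and the sample value is invented for the example. The sketch keeps every field of the rubric schema in the record, leaves unavailable values empty (None) rather than guessing, and rejects fields that are not in the agreed schema.

```python
from typing import Any, Dict, Iterable, List, Optional

Record = Dict[str, Optional[Any]]


def blank_record(field_names: Iterable[str]) -> Record:
    """Create a record holding every schema field, each initially empty (None)."""
    return {name: None for name in field_names}


def populate(record: Record, updates: Dict[str, Any]) -> Record:
    """Return a copy of the record with updates applied; unknown fields are rejected."""
    unknown = sorted(set(updates) - set(record))
    if unknown:
        raise KeyError(f"fields not in the agreed schema: {unknown}")
    merged = dict(record)
    merged.update(updates)
    return merged


def gaps(record: Record) -> List[str]:
    """List fields that are still empty so each gap is noted, not filled with guesses."""
    return [name for name, value in record.items() if value is None]


# Field names come from the content management rubric; the value is an invented sample.
schema = ["cm_manager", "cm_name", "cm_tags"]
record = populate(blank_record(schema), {"cm_name": "Example Project"})
open_gaps = gaps(record)
```

Because the full schema is kept even when most fields are empty, the list returned by gaps doubles as a record of what remains to be collected.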

Note: Not every organization will use the same parts of ciuTshi in the same ways. The key focus should be to perform metadata collection in a consistent, rigorous, and verifiable manner. This level of metadata quality assurance is critical for long-term review and verification of metrics for data operations.

Modeling

In the requirements phase, we began asking critical questions to connect the data asset to its value impact for the project and, as a result, the institution. After generating a rich and deep metadata collection for a data asset, we can now start adjusting and adding metrics to our fields for module- and asset-level modeling. Metrics are often organization specific, but may fall into several categories: descriptive, prescriptive, perceptive, interaction, and outcome, to name a few. Over time, each organization will develop and model its own sets of quantitative and qualitative metrics for its data assets. These metrics should be developed and documented as the organization’s mission, vision, and goals shift: this helps leadership and stakeholders understand how value and impact change based on key indicators found in the metadata. Value should be at these metrics’ loci, but it is by no means the only concern data professionals should have for a data asset. There may be quality, security, ethics, or other ensemble metrics derived from the organization’s members, leadership, stakeholder networks, communities of practice, and other data partners.

There are also universals that exist across all data assets which add richness to the metadata for longitudinal analysis:

  • weight is a default metric that exists across all data elements. Weight can imply several things for a data asset and will be defined by each organization based on their core metric requirements for data lifecycle management (documented in detail via content management), but the initial intent for the metric is to determine how important a metadata field is for the organization (on a scale from 0 to 1). This variable heuristic weight allows each data asset to be shaped by data professionals to reflect their organization’s unique application of data operations for value generation and stakeholder impact.
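
One possible use of weight is sketched below. This is an illustrative assumption rather than a metric defined by ciuTshi: the function weighted_completeness and its helper is_populated are hypothetical names, and the score (the share of total weight carried by populated fields) is only one of many ways an organization might apply the 0 to 1 scale described above.

```python
from typing import Any, Mapping, Optional


def is_populated(value: Optional[Any]) -> bool:
    """Treat None, empty strings, and empty lists as unpopulated."""
    return value is not None and value != "" and value != []


def weighted_completeness(
    values: Mapping[str, Optional[Any]], weights: Mapping[str, float]
) -> float:
    """Return the share of total weight carried by populated metadata fields."""
    total = 0.0
    filled = 0.0
    for name, weight in weights.items():
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"weight for {name!r} must be between 0 and 1")
        total += weight
        if is_populated(values.get(name)):
            filled += weight
    return filled / total if total > 0 else 0.0


# Invented sample: the organization treats tags as half as important as the rest.
weights = {"cm_name": 1.0, "cm_legal": 1.0, "cm_tags": 0.5}
values = {"cm_name": "Example Project", "cm_legal": None, "cm_tags": ["survey"]}
score = weighted_completeness(values, weights)  # 1.5 of 2.5 total weight is populated
```

Lowering a field's weight reduces how much an empty value for that field pulls the score down, which is one way the heuristic can reflect organizational priorities.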

Note: Outside of data operations, other institutional teams and sections may have metrics that do not adhere to ciuTshi standards for comprehensive metadata collection and modeling. Efforts should be made to account for any connections within these external metrics that have direct implications for data asset impact on overall process value. Despite the technical and complex nature of some metrics, a non-technical team member must be able to understand the output of the model, its value, and its implications for institutional impact.

Proof

Establishment of models requires a persistent level of vigilance to ensure their validity. This validity requires several engagements with the model and its modelers to ensure the metrics prove what they claim to prove. This validation starts with a sanity check: do the model, metadata, metrics, and outputs make sense based on what we know about the data asset, its collection, and its utilization? Validation should then be socialized, opening communication with colleagues to ensure the model makes sense when they are walked through the modeling process (observing appropriate security and other ethical research practices). Once internal colleagues review the model and validate its metrics, it may then call for peer review with stakeholders and other reliable partners (preferably a mix of specialized professionals with domain knowledge and astute non-technical persons with keen interrogation skills and curiosity).

If data, system, and other necessary accesses are possible for a reviewer, there is a baseline set of questions that would aid in maintaining proof of a metric’s validity:

  • Are the results reproducible?

  • Is the model transferable?

  • Are the results cross validated?

  • Is proof understandable?

These steps in the process of validation ensure that a model mobilizes metadata production and utilization, keeping data systems and those that leverage them in a persistent state of constructive flow. This flow makes the cumulative and iterative cycle of metadata generation a systematic part of the data operations culture for the organization and its many talented professionals.

Metadata Methodology

Appendix B. Content Management

Templates

These are questions associated with content management for a data asset. It is highly recommended that you collect all essential elements if possible in the course of content management practices.

  • Who are the managers that direct content management?

  • Who are the personnel associated with the data project or content management artifact?

  • Who is the legal counsel that advises on the limits of the content management system?

  • Who is the security point of contact for the content management system?

  • What is the name of the data asset project?

  • What legal restrictions exist for the data project?

  • What security restrictions exist for the data project?

  • Which templates are associated with the project?

  • Which rubrics are associated with the project?

  • What tags are associated with this project?

  • What are the access policies and groups associated with this project?

  • How is version control used in management of this project’s content?

  • How is task management used in the management of this project’s operations and personnel?

  • What are the operational tasks associated with the project’s content management?

  • What is the deprecation plan for the project’s content?

  • Where are the locations of the content documentation?

  • Where are the locations of the content’s associated assets?

  • What is the timeframe for project content review and revision?

Rubrics

The ciuTshi metamodel for metadata contains a set of baseline criteria. This can be adjusted based on the specific language or model metadata requirements. Benchmarks and metrics are flexible elements that can guide and enrich the metadata model for the institution’s specific metadata needs.

  • Benchmark is the expected suitability measure or criteria for the metadata element.

    • essential elements are metric elements of information needed to ensure that data is retained for measurable reason(s).

    • non-essential elements are elements that may not be relevant to the raw data asset of the institution in charge of the data asset.

    • recommended elements are recommended in cases where the raw data asset has set conditions upon its utilization or complexities in its interpretation.

  • Metrics is an extensible array of quantitative and qualitative features associated with the data asset element and can be augmented to suit an institution’s metric requirements. weight is the only default feature in metrics.

    • weight by default is set to 1 for each metadata element.

For more information, refer to the content management document.

| field_name | category | definition | benchmark | metrics |
| --- | --- | --- | --- | --- |
| cm_manager | content_management | manager that directs content management documentation and personnel | essential | [‘weight’:1] |
| cm_personnel | content_management | personnel associated with curation and maintenance of content management assets | essential | [‘weight’:1] |
| cm_legal | content_management | legal advisor for content management practices | essential | [‘weight’:1] |
| cm_security | content_management | security advisor for content management practices | essential | [‘weight’:1] |
| cm_name | content_management | name of the data project | essential | [‘weight’:1] |
| cm_legal_res | content_management | list of legal restrictions for data content | essential | [‘weight’:1] |
| cm_security | content_management | list of security restrictions for data content | essential | [‘weight’:1] |
| cm_templates | content_management | list of associated templates | essential | [‘weight’:1] |
| cm_rubrics | content_management | list of associated rubrics | essential | [‘weight’:1] |
| cm_tags | content_management | list of tags associated with the data project | essential | [‘weight’:1] |
| cm_access | content_management | list of policies and groups associated with data project | essential | [‘weight’:1] |
| cm_vcs | content_management | description and list of version control measures | essential | [‘weight’:1] |
| cm_tms | content_management | description and list of task management measures | essential | [‘weight’:1] |
| cm_ops | content_management | description and list of operations tasks associated with data project | essential | [‘weight’:1] |
| cm_deprecation | content_management | description of deprecation measures | essential | [‘weight’:1] |
| cm_doc_locs | content_management | list of documentation locations for the data project | essential | [‘weight’:1] |
| cm_asset_locs | content_management | list of asset locations for the data project | essential | [‘weight’:1] |
| cm_review | content_management | description of timeline for content review and revision | essential | [‘weight’:1] |
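
A rubric row like those above could be held in code as a small record type. The sketch below is an assumption for illustration, not a structure defined by ciuTshi: the class name RubricElement is hypothetical, and the metrics column is represented as a Python dictionary with weight defaulting to 1, which is one reading of the rubric's metrics notation.

```python
from dataclasses import dataclass, field
from typing import Dict

# Benchmark levels named in the rubric description.
BENCHMARKS = ("essential", "recommended", "non-essential")


@dataclass
class RubricElement:
    """One rubric row: a metadata field with its benchmark and extensible metrics."""

    field_name: str
    category: str
    definition: str
    benchmark: str = "essential"
    metrics: Dict[str, float] = field(default_factory=lambda: {"weight": 1})

    def __post_init__(self) -> None:
        if self.benchmark not in BENCHMARKS:
            raise ValueError(f"unknown benchmark: {self.benchmark!r}")
        # weight is the only default metric; add it if a caller left it out.
        self.metrics.setdefault("weight", 1)
        if not 0 <= self.metrics["weight"] <= 1:
            raise ValueError("weight must be between 0 and 1")


element = RubricElement(
    field_name="cm_manager",
    category="content_management",
    definition="manager that directs content management documentation and personnel",
)
```

An institution could extend a row by adding further keys to element.metrics alongside weight; any such key would be the institution's own addition rather than a ciuTshi default.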

Appendix C. Requirements

Templates

These are questions associated with requirements gathering for a data asset. It is highly recommended that you collect all essential elements if possible in the course of requirements gathering practices.

  • What systems are required to accomplish the data project?

  • Who are the data owners for the data assets?

  • From where do the data assets originate?

  • Who and/or what produced the data asset?

  • When was the data asset acquired by the owner?

  • When will the data be obsolete or required to be deprecated?

  • When was the requirements workshop conducted (list all dates)?

  • When was the official request for information published?

  • What is the task management plan for the data assets?

  • What is the version control plan for the data assets?

  • What is the conceptual model for the data asset?

  • What is the logical model for the data asset?

  • What is the canonical model for the data asset including the final schema?

  • What is the revision plan for the data asset requirements?

  • What addenda were added to the official requirements (if any)?

  • What is the cancellation plan for the data asset requirements?

Request for Engineering

The request for engineering is a short form to capture essential elements of the requirements workshop. This can be expanded and adapted to suit the needs of the particular workshop and its associated data assets.

| Name | Type | Description |
| --- | --- | --- |
| timestamp | Timestamp | Datetime for request. Each adjustment (row) to the short form for a data product and/or service within a specified project will have a unique timestamp. |
| requester | String | Name(s) of those initiating the short form requirements. |
| requester_info | String | Contact information for the requester. |
| customer | String | Name(s) of those receiving the deliverables of a data product and/or service. |
| customer_info | String | Contact information for the customer. |
| project | String | Name of the project from which the requirements originated. |
| deliverable | String | A list of products and services requested by the customer. |
| data_source | String | A list of raw data resources associated with requirements. |
| data_poc | String | Name(s) of those originating and/or curating the data sources. |
| data_poc_info | String | Contact information for the data point of contact. |
| schema | String | The schema and data dictionary for the raw data. |
| data_format | String | The format(s) in which the raw data is stored and/or delivered. |
| data_type | String | The raw data types and their requested transformations. |
| metadata | String | A list of key-value pairs for all raw data metadata. |
| endpoint | String | The requested product and/or service delivery specifications. |
| ux_ui | String | Optional field for user experience or user interface specifications. |
| security | String | Optional field for data security practices. |
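
The short form above maps naturally onto a typed record. The sketch below is a hypothetical illustration, not a ciuTshi reference implementation: the class RequestForEngineering and the helper add_row are assumed names, all sample values are invented, and the uniqueness check is one reading of the rule that each adjustment within a project carries a unique timestamp.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class RequestForEngineering:
    """One row of the short form; field names follow the table above."""

    timestamp: datetime
    requester: str
    requester_info: str
    customer: str
    customer_info: str
    project: str
    deliverable: str
    data_source: str
    data_poc: str
    data_poc_info: str
    schema: str
    data_format: str
    data_type: str
    metadata: str
    endpoint: str
    ux_ui: Optional[str] = None      # optional per the short form
    security: Optional[str] = None   # optional per the short form


def add_row(log: List[RequestForEngineering], row: RequestForEngineering) -> None:
    """Append an adjustment, refusing a timestamp already used within the project."""
    for existing in log:
        if existing.project == row.project and existing.timestamp == row.timestamp:
            raise ValueError("timestamp already used for this project")
    log.append(row)


# Invented sample values for illustration only.
first = RequestForEngineering(
    timestamp=datetime(2024, 3, 4, 9, 0),
    requester="A. Requester", requester_info="requester@example.org",
    customer="B. Customer", customer_info="customer@example.org",
    project="Example Project", deliverable="dashboard",
    data_source="survey exports", data_poc="C. Curator",
    data_poc_info="curator@example.org", schema="see data dictionary",
    data_format="CSV", data_type="string to date", metadata="source=survey",
    endpoint="weekly report",
)
log: List[RequestForEngineering] = []
add_row(log, first)
# A later adjustment is a new row with its own timestamp, so earlier rows are kept.
add_row(log, replace(first, timestamp=datetime(2024, 3, 11, 9, 0), deliverable="dashboard, API"))
```

Keeping every adjustment as its own immutable row, instead of overwriting the first one, preserves a history of how the request changed.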

Rubrics

The ciuTshi metamodel for metadata contains a set of baseline criteria. This can be adjusted based on the specific language or model metadata requirements. Benchmarks and metrics are flexible elements that can guide and enrich the metadata model for the institution’s specific metadata needs.

  • Benchmark is the expected suitability measure or criteria for the metadata element.

    • essential elements are metric elements of information needed to ensure that data is retained for measurable reason(s).

    • non-essential elements are elements that may not be relevant to the raw data asset of the institution in charge of the data asset.

    • recommended elements are recommended in cases where the raw data asset has set conditions upon its utilization or complexities in its interpretation.

  • Metrics is an extensible array of quantitative and qualitative features associated with the data asset element and can be augmented to suit an institution’s metric requirements. weight is the only default feature in metrics.

    • weight by default is set to 1 for each metadata element.

For more information, refer to the Metadata document.

| field_name | category | definition | benchmark | metrics |
| --- | --- | --- | --- | --- |
| req_systems | requirements | list of the systems and sub-system resources required for the data asset project | essential | [‘weight’:1] |
| req_owner | requirements | list of the names of the originating data owners and their associated organizational information | essential | [‘weight’:1] |
| req_origin | requirements | list of the origins of the data assets with any lineage or provenance details | essential | [‘weight’:1] |
| req_data_production | requirements | details pertaining to the means and methods used to produce the data assets | essential | [‘weight’:1] |
| req_acquisition | requirements | dates and methods by which the data asset was acquired by the owner | essential | [‘weight’:1] |
| req_deprecation | requirements | details on the end use date of the data asset with procedural information | essential | [‘weight’:1] |
| req_workshop_con | requirements | the dates of the requirements workshop | essential | [‘weight’:1] |
| req_rfe | requirements | the date the official requirements were published | essential | [‘weight’:1] |
| req_tms | requirements | description of the task management plan and setup for the data project | essential | [‘weight’:1] |
| req_vcs | requirements | description of the version control plan and setup for the data project | essential | [‘weight’:1] |
| req_conceptual_model | requirements | a high-level, real world customer data model concept(s) including the scheme and schema | essential | [‘weight’:1] |
| req_logical_model | requirements | a detailed description of connections between data fields and conceptual model(s) including scheme and schema | essential | [‘weight’:1] |
| req_canonical_model | requirements | the final scheme and schema for the data assets: often the same as mod_schema | essential | [‘weight’:1] |
| req_rev | requirements | description of the revision and review plan for the requirements and addenda | essential | [‘weight’:1] |
| req_addendum | requirements | a list and description of the addenda for the official published requirements: using the RFE template field to demonstrate changes | essential | [‘weight’:1] |
| req_cancel | requirements | description of cancellation plan for redacted or terminated requirements | essential | [‘weight’:1] |

Appendix D. Task Management

Templates

These are questions associated with task management for a data asset. It is highly recommended that you collect all essential elements if possible in the course of task management practices.

  • Who is the manager for the task management process?

  • Who is the product owner who will oversee and receive the data asset?

  • Who is the scrum master?

  • Who are the scrum team personnel?

  • Who are the other stakeholders involved in the data project tasks?

  • Who are the other vendors involved in the data project tasks?

  • What is the project name?

  • What is the structure of the TMS boards?

  • What are the structures of the boards’ cards?

  • What are the labels used in the TMS process?

  • What is the timeline for the project?

  • How is VCS of TMS artifacts handled for the project?

  • What are the critical metrics for the project?

  • What are the review plans for the TMS process?

  • What are the elements of the completed MVP?

  • What are the outcomes of the sprint retrospectives?

  • Where are the TMS artifacts stored for the project?

Data Team Board Setup

The name of the board should be Project Name - Data Team where Project Name is the name of the project from which the requirements originated. The list names and descriptions are as follows:

| Name | Description |
| --- | --- |
| Backlog | The list of tasks and subtasks derived from the project’s business requirements. |
| Sprint Backlog | The selected and prioritized list of tasks and subtasks scoped for completion during a sprint. |
| In Progress | Assigned sprint tasks in the process of being completed for a sprint. |
| Sprint dd.mm.yy - x Weeks | Completed sprint tasks for a given sprint. |

There are special cards included by default with this template:

| Card | List | Description |
| --- | --- | --- |
| Product Owner: Name | Backlog | The contact information and additional information for the product owner. |
| Scrum Master: Name | Backlog | The contact information and additional information for the project scrum master. |
| Product Goal | Backlog | The product goal statement used to scope the tasks in the backlog. Includes a Definition of Done for the product. |
| Sprint Goal | Sprint dd.mm.yy - x Weeks | The sprint goal statement used to scope the tasks in the sprint backlog. |

To assist the product owner and scrum master, colored labels should also be used to signal status in a sprint.

| Label Color | Description |
| --- | --- |
| Green | On Track |
| Yellow | Testing |
| Orange | Quality Control |
| Red | Blocker |
| Black | Done |
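
The board conventions above can be captured as plain data. The sketch below is an illustration under stated assumptions and is not tied to any particular task management tool: the function names board_name, sprint_list_name, and new_board are hypothetical, and the date in the sprint list name is assumed to be the sprint start date, which the template does not specify.

```python
from datetime import date
from typing import Dict, List

# Label colors and their meanings, as listed in the template.
LABELS: Dict[str, str] = {
    "Green": "On Track",
    "Yellow": "Testing",
    "Orange": "Quality Control",
    "Red": "Blocker",
    "Black": "Done",
}


def board_name(project_name: str) -> str:
    """Build the board name in the 'Project Name - Data Team' form."""
    return f"{project_name} - Data Team"


def sprint_list_name(start: date, weeks: int) -> str:
    """Build a list name in the 'Sprint dd.mm.yy - x Weeks' form."""
    if weeks < 1:
        raise ValueError("a sprint must last at least one week")
    return f"Sprint {start.strftime('%d.%m.%y')} - {weeks} Weeks"


def new_board(product_owner: str, scrum_master: str,
              sprint_start: date, sprint_weeks: int) -> Dict[str, List[str]]:
    """Return the default lists, each holding the special cards placed on it."""
    return {
        "Backlog": [
            f"Product Owner: {product_owner}",
            f"Scrum Master: {scrum_master}",
            "Product Goal",
        ],
        "Sprint Backlog": [],
        "In Progress": [],
        sprint_list_name(sprint_start, sprint_weeks): ["Sprint Goal"],
    }


# Invented sample names for illustration only.
name = board_name("Example Project")
board = new_board("A. Owner", "B. Master", date(2024, 3, 4), 2)
```

A structure like this could be exported to whichever board tool the team uses, and it also gives tms_board, tms_card, and tms_label something concrete to describe.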

Rubrics

The ciuTshi metamodel for metadata contains a set of baseline criteria. This can be adjusted based on the specific language or model metadata requirements. Benchmarks and metrics are flexible elements that can guide and enrich the metadata model for the institution’s specific metadata needs.

  • Benchmark is the expected suitability measure or criteria for the metadata element.

    • essential elements are metric elements of information needed to ensure that data is retained for measurable reason(s).

    • non-essential elements are elements that may not be relevant to the raw data asset of the institution in charge of the data asset.

    • recommended elements are recommended in cases where the raw data asset has set conditions upon its utilization or complexities in its interpretation.

  • Metrics is an extensible array of quantitative and qualitative features associated with the data asset element and can be augmented to suit an institution’s metric requirements. weight is the only default feature in metrics.

    • weight by default is set to 1 for each metadata element.

For more information, refer to the Metadata document.

| field_name | category | definition | benchmark | metrics |
| --- | --- | --- | --- | --- |
| tms_manager | task_management | managers involved in the TMS processes | essential | [‘weight’:1] |
| tms_owner | task_management | asset owners involved in the TMS processes | essential | [‘weight’:1] |
| tms_scrum_master | task_management | scrum masters coordinating tasks and sprints | essential | [‘weight’:1] |
| tms_scrum_team | task_management | team executing setup, sprint, and MVP tasks | essential | [‘weight’:1] |
| tms_stakeholder | task_management | persons with responsibilities for TMS outcomes | essential | [‘weight’:1] |
| tms_vendor | task_management | vendors facilitating tasks within TMS processes | essential | [‘weight’:1] |
| tms_project_name | task_management | name of the TMS projects for data assets | essential | [‘weight’:1] |
| tms_board | task_management | description of the board setup with changes for a data project’s TMS processes | essential | [‘weight’:1] |
| tms_card | task_management | description of the card setup with changes for a data project’s TMS processes | essential | [‘weight’:1] |
| tms_label | task_management | description of the labels used for a data project’s TMS processes | essential | [‘weight’:1] |
| tms_timeline | task_management | description of the timeline to deadline including key review dates | essential | [‘weight’:1] |
| tms_metrics | task_management | description of key metrics and models including burndown and velocity chart information | essential | [‘weight’:1] |
| tms_rev | task_management | description of review plan for TMS processes including extensions and timeline shifts | essential | [‘weight’:1] |
| tms_vcs | task_management | description of VCS plan for TMS assets and artifacts | essential | [‘weight’:1] |
| tms_mvp | task_management | description of complete MVP to definition of done standards including any changes | essential | [‘weight’:1] |
| tms_retrospective | task_management | description of retrospective outcomes for each sprint | essential | [‘weight’:1] |
| tms_artifacts | task_management | description of artifact locations and details on associated metrics | essential | [‘weight’:1] |

Appendix E. Version Control

Templates

These are questions associated with version control for a data asset. It is highly recommended that you collect all essential elements if possible in the course of version control practices.

  • Who owns the repositories associated with the data project?

  • Who has access to the repositories?

  • Who are the reviewers of repository activities?

  • Who are the contributors to each repository?

  • What vendors are used to facilitate VCS tasks?

  • Where are VCS tasks being conducted?

  • What is the project name?

  • What are the repository names?

  • What workflow model and schemes were used?

  • What tags were applied to each repository?

  • How is review of each repository handled?

  • Where are artifacts associated with the repositories stored and managed?

  • Where are the documents associated with the repository and source code stored and managed?

  • What migrations are associated with this repository?

GitFlow Process

There are several things to consider when setting up a repository for source code and documentation. These are a few of the considerations to establish ahead of linking VCS to TMS and other data operations practices:

  • What is the repository naming convention?

    • Is there a project digraph + tool name convention or something else?

  • Do you have a standard Readme template to capture essential elements of information?

    • Title

    • Description

    • Installation instructions

    • Usage instructions and limitations

  • Do you have a docs directory convention?

    • Are you using Sphinx or another static site generator?

  • Is there a Wiki in or connected to the repository?

  • How are source code and documentation issues tracked?

  • What license model is being used for the source code?

  • Are there contribution guidelines for members of the project?

    • Are there instructions for members on workflow practices?

    • Are peer or mob coding practices used? If so, how?

Repository Template

The goal of using a standard repository template is maximum reuse of well-documented code and methods.

An example of this template is as follows (from the top of the repository):

  • .gitignore

  • README.md

    • Links/locations to CMS on information system(s)

  • LICENSE.md

  • CONTRIBUTING.md

  • SECURITY.md

  • data_ops

    • README.md

    • CONTRIBUTING.md

    • docs

      • requirements

        • README.md

      • task_management

        • README.md

      • methods_research

        • README.md

      • quality_assurance

        • README.md

      • images

        • README.md

    • data_management

      • storage

        • README.md

        • pipelines

      • modeling

        • README.md

      • analysis

        • README.md

      • integration

        • README.md

For each DMS section associated with code or tool methods, source folder content should maintain this baseline:

  • models

    • model_a

      • README.md

      • source

      • tests
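
The template above can be created with a short script. The following sketch is an illustration only: the function names scaffold and add_method are hypothetical, and the layout assumes that storage, modeling, analysis, and integration are directories that each hold a README.md, with pipelines as a directory under storage. It uses only the Python standard library and writes into a temporary directory for the demonstration.

```python
import tempfile
from pathlib import Path
from typing import List

# Files from the repository template, relative to the repository root.
FILES = [
    ".gitignore", "README.md", "LICENSE.md", "CONTRIBUTING.md", "SECURITY.md",
    "data_ops/README.md", "data_ops/CONTRIBUTING.md",
    "data_ops/docs/requirements/README.md",
    "data_ops/docs/task_management/README.md",
    "data_ops/docs/methods_research/README.md",
    "data_ops/docs/quality_assurance/README.md",
    "data_ops/docs/images/README.md",
    "data_ops/data_management/storage/README.md",
    "data_ops/data_management/modeling/README.md",
    "data_ops/data_management/analysis/README.md",
    "data_ops/data_management/integration/README.md",
]
# Directories that the template lists without any files inside them.
EMPTY_DIRS = ["data_ops/data_management/storage/pipelines"]


def scaffold(root: Path) -> List[Path]:
    """Create the template under root and return the files newly created."""
    created = []
    for rel in FILES:
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():  # never overwrite existing content
            path.touch()
            created.append(path)
    for rel in EMPTY_DIRS:
        (root / rel).mkdir(parents=True, exist_ok=True)
    return created


def add_method(section: Path, group: str, name: str) -> Path:
    """Add the README.md, source, and tests baseline for one code or tool method."""
    base = section / group / name
    (base / "source").mkdir(parents=True, exist_ok=True)
    (base / "tests").mkdir(parents=True, exist_ok=True)
    (base / "README.md").touch()
    return base


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    created = scaffold(root)
    modeling = root / "data_ops" / "data_management" / "modeling"
    model_dir = add_method(modeling, "models", "model_a")
    has_baseline = (model_dir / "source").is_dir() and (model_dir / "tests").is_dir()
    has_pipelines = (root / EMPTY_DIRS[0]).is_dir()
```

Git does not track empty directories, so directories such as pipelines, source, and tests would need at least one file (for example a placeholder) before they appear in a commit.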

Rubrics

The ciuTshi metamodel for metadata contains a set of baseline criteria. This can be adjusted based on the specific language or model metadata requirements. Benchmarks and metrics are flexible elements that can guide and enrich the metadata model for the institution’s specific metadata needs.

  • Benchmark is the expected suitability measure or criteria for the metadata element.

    • essential elements are metric elements of information needed to ensure that data is retained for measurable reason(s).

    • non-essential elements are elements that may not be relevant to the raw data asset of the institution in charge of the data asset.

    • recommended elements are recommended in cases where the raw data asset has set conditions upon its utilization or complexities in its interpretation.

  • Metrics is an extensible array of quantitative and qualitative features associated with the data asset element and can be augmented to suit an institution’s metric requirements. weight is the only default feature in metrics.

    • weight by default is set to 1 for each metadata element.

For more information, refer to the Metadata document.

| field_name | category | definition | benchmark | metrics |
| --- | --- | --- | --- | --- |
| vcs_owner | version_control | list all persons with admin access to each repository | essential | [‘weight’:1] |
| vcs_member | version_control | list all persons with general access to each repository | essential | [‘weight’:1] |
| vcs_reviewer | version_control | list all persons responsible for source code review | essential | [‘weight’:1] |
| vcs_contributor | version_control | list all persons who modified or added to source code | essential | [‘weight’:1] |
| vcs_vendor | version_control | list all persons or groups who provide services to aid in VCS tasks (e.g., CI/CD) | essential | [‘weight’:1] |
| vcs_system | version_control | list all systems on which VCS tasks are conducted | essential | [‘weight’:1] |
| vcs_project | version_control | name of the associated data project | essential | [‘weight’:1] |
| vcs_license | version_control | name of license associated with source code and documentation | essential | [‘weight’:1] |
| vcs_repo | version_control | list of all repo names and branches for a data project | essential | [‘weight’:1] |
| vcs_workflow | version_control | description of VCS scheme and coding practice rules (e.g., GitFlow) | essential | [‘weight’:1] |
| vcs_tags | version_control | list of tags applied to the repositories associated with a data project and broader data operations schemas | essential | [‘weight’:1] |
| vcs_rev | version_control | description of the review practices for repositories | essential | [‘weight’:1] |
| vcs_artifacts | version_control | list of all data connections and other essential elements associated with each repository | essential | [‘weight’:1] |
| vcs_docs | version_control | list of locations for the repositories’ documentation sources (e.g., readthedocs, wikis) | essential | [‘weight’:1] |
| vcs_migrate | version_control | list of migrations to other repositories from the data project’s collection with descriptions of rationale for migration | essential | [‘weight’:1] |

Appendix F. Quality Assurance

Templates

These are questions associated with quality assurance for a data asset. It is highly recommended that you collect all essential elements if possible in the course of quality assurance practices.

  • Who are the managers for quality assurance of the data assets?

  • Who are the quality assurance team members working on the data assets?

  • Who are the data governance board members governing the data assets?

  • What is the name of the data project?

  • What are the accuracy standards for the data assets?

  • What are the completeness standards for the data assets?

  • What are the integrity standards for the data assets?

  • What are the reasonability standards for the data assets?

  • What are the timeliness standards for the data assets?

  • What are the uniqueness standards for the data assets?

  • What are the validity standards for the data assets?

  • What are the quality assurance rules associated with the data assets?

  • How are issues handled for the data assets?

  • What are the profile models for the data assets?

  • What are the quality assurance assessment outcomes for the data assets?

  • What are the key metrics for the data assets?

  • What is the strategy for the data assets?

  • How is reporting handled for the data assets?

  • Where are the quality assurance documents stored for the data assets?

Rubrics

The ciuTshi metamodel for metadata contains a set of baseline criteria. This can be adjusted based on the specific language or model metadata requirements. Benchmarks and metrics are flexible elements that can guide and enrich the metadata model for the institution’s specific metadata needs.

  • Benchmark is the expected suitability measure or criteria for the metadata element.

    • essential elements are metric elements of information needed to ensure that data is retained for measurable reason(s).

    • non-essential elements are elements that may not be relevant to the raw data asset of the institution in charge of the data asset.

    • recommended elements are recommended in cases where the raw data asset has set conditions upon its utilization or complexities in its interpretation.

  • Metrics is an extensible array of quantitative and qualitative features associated with the data asset element and can be augmented to suit an institution’s metric requirements. weight is the only default feature in metrics.

    • weight by default is set to 1 for each metadata element.

For more information, refer to the Metadata document.
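
As a rough illustration, one rubric row could be represented in code as shown below. This is a minimal sketch, not part of the standard: the names RubricElement and Benchmark and the priority metric are hypothetical. Only the five column names and the default weight of 1 come from the rubric.

```python
from dataclasses import dataclass, field
from enum import Enum


class Benchmark(Enum):
    # Assumed set of levels; the rubric tables use "optional" for
    # elements the text describes as non-essential.
    ESSENTIAL = "essential"
    RECOMMENDED = "recommended"
    OPTIONAL = "optional"


@dataclass
class RubricElement:
    field_name: str
    category: str
    definition: str
    benchmark: Benchmark
    # Metrics are extensible; weight is the only default feature (set to 1).
    metrics: dict = field(default_factory=lambda: {"weight": 1})


qa_manager = RubricElement(
    field_name="qa_manager",
    category="quality_assurance",
    definition="list of managers associated with quality assurance for the data assets",
    benchmark=Benchmark.ESSENTIAL,
)

# An institution may augment the metrics with its own features
# (hypothetical example, not defined by the standard).
qa_manager.metrics["priority"] = "high"
```

Using a factory for the metrics default gives each element its own dictionary, so augmenting one element does not change another.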

field_name

category

definition

benchmark

metrics

qa_manager

quality_assurance

list of managers associated with quality assurance for the data assets

essential

[‘weight’:1]

qa_team

quality_assurance

list of team members associated with quality assurance tasks for the data assets including roles

essential

[‘weight’:1]

qa_board

quality_assurance

list of data governance board members associated with guidance on data asset quality assurance tasks

essential

[‘weight’:1]

qa_name

quality_assurance

name of the data project

essential

[‘weight’:1]

qa_accuracy

quality_assurance

description of accuracy standards associated with the data assets

essential

[‘weight’:1]

qa_completeness

quality_assurance

description of completeness standards associated with the data assets

essential

[‘weight’:1]

qa_integrity

quality_assurance

description of integrity standards associated with the data assets

essential

[‘weight’:1]

qa_reasonability

quality_assurance

description of reasonability standards associated with the data assets

essential

[‘weight’:1]

qa_timeliness

quality_assurance

description of timeliness standards associated with the data assets

essential

[‘weight’:1]

qa_uniqueness

quality_assurance

description of uniqueness standards associated with the data assets

essential

[‘weight’:1]

qa_validity

quality_assurance

description of validity standards associated with the data assets

essential

[‘weight’:1]

qa_rules

quality_assurance

description of quality assurance rules associated with the data assets

essential

[‘weight’:1]

qa_issues

quality_assurance

description of the methodologies used to handle quality assurance issues

essential

[‘weight’:1]

qa_profile

quality_assurance

description of the enhancements, analysis models, and outcomes associated with the quality assurance profile for each data asset

essential

[‘weight’:1]

qa_assessment

quality_assurance

list of categories for each data asset with rationale

essential

[‘weight’:1]

qa_metrics

quality_assurance

list of metrics for each data asset with descriptions of impacts and actions taken on quality assurance metrics for data assets

essential

[‘weight’:1]

qa_strategy

quality_assurance

description of the data governance strategy for the data assets with location information for the strategy documentation

essential

[‘weight’:1]

qa_report

quality_assurance

list of locations for logs and other forms of reporting for the data assets

essential

[‘weight’:1]

qa_docs

quality_assurance

list of locations for documentation associated with quality assurance for general guidance and data asset specific information

essential

[‘weight’:1]
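
One way an institution might use the weight metric is to score how completely a rubric has been populated. The sketch below is an assumption about usage, not something the standard prescribes: the function weighted_completeness and the sample record are hypothetical, and a field counts as populated when its value is non-empty.

```python
def weighted_completeness(rubric, record):
    """Return the weighted share of rubric elements that are populated.

    rubric: dict mapping field_name -> weight (1 by default)
    record: dict mapping field_name -> collected metadata value
    """
    total = sum(rubric.values())
    if total == 0:
        return 0.0
    # A field counts as populated when its value is not None, "", or empty.
    filled = sum(w for name, w in rubric.items() if record.get(name))
    return filled / total


# Subset of the quality_assurance rubric, each with the default weight of 1.
qa_rubric = {"qa_manager": 1, "qa_team": 1, "qa_name": 1, "qa_rules": 1}

# Hypothetical, partially populated record.
qa_record = {
    "qa_manager": ["A. Manager"],
    "qa_team": [],
    "qa_name": "example project",
}

score = weighted_completeness(qa_rubric, qa_record)  # 2 of 4 elements populated
```

Raising the weight of an element in the rubric increases its share of the score, which is one way to reflect institution-specific priorities.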

Appendix G. Security

Templates

These are questions associated with security for a data asset. It is highly recommended that you collect all essential elements if possible in the course of security practices.

  • Who are the security managers associated with the data project?

  • Who are the owners of the data assets?

  • Who are the stewards for the data assets?

  • Who are the information security managers associated with the data project?

  • Who are the information assurance personnel associated with the data project?

  • Who are the data governance board members associated with the data project?

  • Who are the security team members associated with data project security tasks?

  • What is the name of the data project?

  • What are the tags associated with the data assets?

  • How is access handled for the data assets and the data project?

  • What are the security policies that apply to the data project and assets?

  • What privacy measures are required for the data project and data assets?

  • What are the authentication protocols in place for the data assets?

  • What additional guidelines exist for the data assets?

  • What are the key security metrics for the data assets?

  • How is training handled for secure access to data project assets?

Rubrics

The ciuTshi metamodel for metadata contains a set of baseline criteria. This can be adjusted based on the specific language or model metadata requirements. Benchmarks and metrics are flexible elements that can guide and enrich the metadata model for the institution’s specific metadata needs.

  • Benchmark is the expected suitability measure or criteria for the metadata element.

    • essential elements are metric elements of information needed to ensure that data is retained for measurable reason(s).

    • non-essential elements are elements that may not be relevant to the raw data asset of the institution in charge of the data asset.

    • recommended elements are recommended in cases where the raw data asset has set conditions upon its utilization or complexities in its interpretation.

  • Metrics is an extensible array of quantitative and qualitative features associated with the data asset element and can be augmented to suit an institution’s metric requirements. weight is the only default feature in metrics.

    • weight by default is set to 1 for each metadata element.

For more information, refer to the Metadata document.

field_name

category

definition

benchmark

metrics

sec_manager

security

list of managers associated with data project security

essential

[‘weight’:1]

sec_owner

security

list of owners associated with the data assets

essential

[‘weight’:1]

sec_steward

security

list of stewards responsible for tracking data asset security within the data project

essential

[‘weight’:1]

sec_infosec_manager

security

list of institutional managers associated with data system security

essential

[‘weight’:1]

sec_info_assurance

security

list of institutional personnel associated with data policy and guideline application to data project assets

essential

[‘weight’:1]

sec_board

security

list of data governance board members associated with the data project security

essential

[‘weight’:1]

sec_team

security

list of security team members responsible for data project security tasks including roles

essential

[‘weight’:1]

sec_name

security

name of the data project

essential

[‘weight’:1]

sec_tags

security

list of tags associated with the data project and data assets including access levels, security groups, and required credentials

essential

[‘weight’:1]

sec_access

security

description of the group policy put in place for a data project and data assets

essential

[‘weight’:1]

sec_policy

security

description of security regulation in place for the data project

essential

[‘weight’:1]

sec_privacy

security

description of security features for data assets including masking, synthetic data, or other security models

essential

[‘weight’:1]

sec_authentication

security

description of authentication practice required for the data project and data assets including required logs, metrics, and audits

essential

[‘weight’:1]

sec_guidelines

security

description of specific guidance for each data asset with a security action plan

essential

[‘weight’:1]

sec_metrics

security

list of key security metrics for the data project and the data assets

essential

[‘weight’:1]

sec_training

security

description of training requirement in place for the data project, data assets, and broader institutional certification requirements

essential

[‘weight’:1]
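
Because every element in the security rubric is benchmarked as essential, it can help to list which ones are still unpopulated. The following is a minimal sketch under that assumption. The function missing_essentials and the sample record are hypothetical and are not part of the standard.

```python
def missing_essentials(rubric, record):
    """List essential rubric fields that have no value in the record.

    rubric: dict mapping field_name -> benchmark string
    record: dict mapping field_name -> collected metadata value
    """
    return [
        name
        for name, benchmark in rubric.items()
        if benchmark == "essential" and not record.get(name)
    ]


# Subset of the security rubric.
sec_rubric = {
    "sec_manager": "essential",
    "sec_owner": "essential",
    "sec_name": "essential",
    "sec_access": "essential",
}

# Hypothetical record: one field absent, one present but empty.
sec_record = {
    "sec_manager": ["A. Manager"],
    "sec_name": "example project",
    "sec_access": "",
}

gaps = missing_essentials(sec_rubric, sec_record)
```

Here gaps contains sec_owner and sec_access, which can then be noted as knowledge gaps for later enrichment.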

Appendix H. Ethics

Templates

These are questions associated with ethics for a data asset. It is highly recommended that you collect all essential elements if possible in the course of ethics practices.

  • Who are the managers associated with data asset ethics?

  • Who are the data asset owners?

  • Who are the data asset stewards?

  • Who are the team members associated with ethical review for data asset utilization?

  • Who are the data asset stakeholders?

  • Who are the IRB team members associated with the data project?

  • Who are the agencies associated with ethical practices for the data project?

  • What is the code of conduct associated with ethical data practices?

  • What was the consent process for the data assets and data project?

  • Are there certifications associated with data project ethics?

  • What ethics policies are associated with the data project?

  • What is the name of the data project?

  • How was ethical acquisition of the data assets handled for the data project?

  • How was ethical storage of the data assets handled for the data project?

  • How was ethical processing of the data assets handled for the data project?

  • How was ethical monitoring of the data assets handled for the data project?

  • What are the impacts or potential impacts associated with ethical challenges for the data project?

  • What metrics are leveraged to aid in ethical standards for data assets and the data project?

Rubrics

The ciuTshi metamodel for metadata contains a set of baseline criteria. This can be adjusted based on the specific language or model metadata requirements. Benchmarks and metrics are flexible elements that can guide and enrich the metadata model for the institution’s specific metadata needs.

  • Benchmark is the expected suitability measure or criteria for the metadata element.

    • essential elements are metric elements of information needed to ensure that data is retained for measurable reason(s).

    • non-essential elements are elements that may not be relevant to the raw data asset of the institution in charge of the data asset.

    • recommended elements are recommended in cases where the raw data asset has set conditions upon its utilization or complexities in its interpretation.

  • Metrics is an extensible array of quantitative and qualitative features associated with the data asset element and can be augmented to suit an institution’s metric requirements. weight is the only default feature in metrics.

    • weight by default is set to 1 for each metadata element.

For more information, refer to the Metadata document.

field_name

category

definition

benchmark

metrics

eth_manager

ethics

list of managers associated with ethics for data project

essential

[‘weight’:1]

eth_owner

ethics

list of data owners associated with ethics for data project

essential

[‘weight’:1]

eth_steward

ethics

list of data stewards associated with ethics for data project

essential

[‘weight’:1]

eth_team

ethics

names and roles of persons associated with ethics for data project

essential

[‘weight’:1]

eth_stakeholder

ethics

list of stakeholders associated with ethics for data project

essential

[‘weight’:1]

eth_irb

ethics

institutional review boards associated with the data assets or data project

essential

[‘weight’:1]

eth_agency

ethics

list of agencies associated with ethics for data project

essential

[‘weight’:1]

eth_conduct

ethics

codes of conduct used to govern ethics for the data project

essential

[‘weight’:1]

eth_consent

ethics

consent practices associated with data assets and the data project

essential

[‘weight’:1]

eth_certification

ethics

list of certifications required for ethical data practices for data project

essential

[‘weight’:1]

eth_policy

ethics

policies guiding ethical practices for data project

essential

[‘weight’:1]

eth_name

ethics

names of the data assets

essential

[‘weight’:1]

eth_acquisition

ethics

description of methods of ethical data acquisition used for each data asset

essential

[‘weight’:1]

eth_storage

ethics

description of methods of ethical data storage used for each data asset access

essential

[‘weight’:1]

eth_process

ethics

description of methods of ethical data processing used for each data asset delivery

essential

[‘weight’:1]

eth_monitor

ethics

description of methods of ethical data monitoring used for each data asset and their deprecation

essential

[‘weight’:1]

eth_impact

ethics

description of impacts associated with ethical outcomes

essential

[‘weight’:1]

eth_metrics

ethics

list of key metrics for measuring ethical standards used for each data asset and the data projects

essential

[‘weight’:1]

Appendix I. Storage

Templates

These are questions associated with storage of data. It is highly recommended that you collect all essential elements if possible in the course of data management practices for data storage.

  • Who is the manager that directs data storage practices and personnel?

  • Who is the data manager that assists the primary data manager (if exists)?

  • Who is the manager that runs a project for which a raw data asset is required?

  • Who is the legal counsel that advises on the legal limits for use of a raw data asset?

  • Who is the project manager that assists the primary project manager?

  • Who is on the team that supports the data manager and/or deputy data manager?

  • Who is on the team that supports the project manager and/or project lead?

  • Who is the engineer that supports the data management team?

  • Who is the owner that produced the raw data asset?

  • Who is the steward that moves the raw data asset to the data manager?

  • Who is the security agent that is in charge of tracking loading of and access to raw data assets?

  • What is the name of the stored raw data asset?

  • What is the type of the raw data asset (e.g., reference, critical, project)?

  • Where is the requirements reference for a raw data asset?

  • Is there a form completed for the loading of a raw data asset?

  • Is there a form completed for access to a raw data asset?

  • How is the data team handling task management for raw data asset tasks?

  • How is the data team handling version control for data loading methods and code?

  • Was data mining used for sourcing the raw data? What methods were used for the mined data?

  • What pipeline models, tools and/or code for processing raw data into storage are used?

  • Where is the raw data stored?

  • Where is the master data derived from the raw data stored?

  • Are there any addenda or changes to the raw data requirements? If so, please describe in detail with references.

  • What is the monitoring plan for the raw data (including retention and deprecation plan)?

Rubrics

The ciuTshi metamodel for metadata contains a set of baseline criteria. This can be adjusted based on the specific language or model metadata requirements. Benchmarks and metrics are flexible elements that can guide and enrich the metadata model for the institution’s specific metadata needs.

  • Benchmark is the expected suitability measure or criteria for the metadata element.

    • essential elements are metric elements of information needed to ensure that data is retained for measurable reason(s).

    • non-essential elements are elements that may not be relevant to the raw data asset of the institution in charge of the data asset.

    • recommended elements are recommended in cases where the raw data asset has set conditions upon its utilization or complexities in its interpretation.

  • Metrics is an extensible array of quantitative and qualitative features associated with the data asset element and can be augmented to suit an institution’s metric requirements. weight is the only default feature in metrics.

    • weight by default is set to 1 for each metadata element.

For more information, refer to the Metadata document.

field_name

category

definition

benchmark

metrics

sto_data_manager

storage

manager that directs data storage practices and personnel

essential

[‘weight’:1]

sto_deputy_data_manager

storage

data manager that assists the primary data manager (if exists)

optional

[‘weight’:1]

sto_project_manager

storage

manager that runs a project for which a raw data asset is required

essential

[‘weight’:1]

sto_legal_counsel

storage

counsel that advises on the legal limits for use of a raw data asset

optional

[‘weight’:1]

sto_project_lead

storage

project manager that assists the primary project manager

optional

[‘weight’:1]

sto_data_management_team

storage

team that supports the data manager and/or deputy data manager

essential

[‘weight’:1]

sto_project_team

storage

team that supports the project manager and/or project lead

optional

[‘weight’:1]

sto_data_engineer

storage

engineer that supports the data management team

essential

[‘weight’:1]

sto_data_owner

storage

owner that produced the raw data asset

essential

[‘weight’:1]

sto_data_steward

storage

steward that moves the raw data asset to the data manager

essential

[‘weight’:1]

sto_tta

storage

security agent that is in charge of tracking loading of and access to raw data assets

essential

[‘weight’:1]

sto_data_asset_name

storage

name of the stored raw data asset

essential

[‘weight’:1]

sto_type

storage

type of raw data asset (e.g., reference, critical, project)

essential

[‘weight’:1]

sto_requirements

storage

requirements reference for a raw data asset

essential

[‘weight’:1]

sto_load_form

storage

form reference for the loading of a raw data asset

essential

[‘weight’:1]

sto_access_form

storage

form reference for the access to a raw data asset

essential

[‘weight’:1]

sto_task_management

storage

task management reference for data management team’s raw data asset tasks

essential

[‘weight’:1]

sto_version_control

storage

version control reference for data engineer(s) data loading methods and code

essential

[‘weight’:1]

sto_data_mining

storage

data mining references for sourcing and methods of mined data

recommended if data mining conducted, else optional

[‘weight’:1]

sto_pipelines

storage

pipeline references for models, tools and/or code for processing raw data into storage

essential

[‘weight’:1]

sto_raw_data_location

storage

location reference(s) of raw data storage

essential

[‘weight’:1]

sto_master_data_location

storage

location reference(s) of master data storage

essential

[‘weight’:1]

sto_addendum

storage

form reference for the change in storage to a raw data asset

recommended if addendum generated, else optional

[‘weight’:1]

sto_monitoring

storage

reference for the monitoring plan, including retention and deprecation, for all data assets

essential

[‘weight’:1]
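
Two storage elements (sto_data_mining and sto_addendum) carry a conditional benchmark of the form "recommended if ..., else optional". The sketch below shows one possible way to resolve such a benchmark to a single level. The function resolve_benchmark is hypothetical, and the parsing assumes the exact phrasing used in the table above.

```python
def resolve_benchmark(benchmark, condition_met):
    """Resolve a rubric benchmark string to a single level.

    Handles the conditional form "recommended if <condition>, else optional";
    any other benchmark string is returned unchanged.
    """
    if " if " in benchmark and ", else " in benchmark:
        when_true, rest = benchmark.split(" if ", 1)
        _, when_false = rest.split(", else ", 1)
        return when_true.strip() if condition_met else when_false.strip()
    return benchmark


mining = "recommended if data mining conducted, else optional"

level_when_mined = resolve_benchmark(mining, condition_met=True)
level_when_not_mined = resolve_benchmark(mining, condition_met=False)
level_plain = resolve_benchmark("essential", condition_met=False)
```

Whether the condition is met (for example, whether data mining was conducted) is supplied by the caller, since that information comes from the data team rather than from the rubric itself.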

Appendix J. Modeling

Templates

These are questions associated with modeling of data. It is highly recommended that you collect all essential elements if possible in the course of data management practices for data modeling.

  • Who is the data manager that directs data modeling practices and personnel?

  • Who is the data manager that assists the primary data manager with modeling (if exists)?

  • Who is on the team that supports the data manager and/or deputy data manager with modeling?

  • Who are the engineer(s) that support the data management team with modeling?

  • Who are the data personnel that produced the modeled data asset(s)?

  • Who is the data steward that moves the raw data asset to and from data management team modeling?

  • How is task management handled for the data modeling tasks?

  • How is version control handled for the data modeling methods and code?

  • What is the scheme(s) for the final data model?

  • What is the schema(s) for the final data model (including data types)?

  • What methods derived from methodologies were used in the modeling of the data asset?

  • What is the overall model approach?

  • Who are the reviewer(s) of the final model elements including scheme, schema, methods, and other essential features?

  • What are the tags for the model’s associated categories?

  • What are the ontological entities that are linked to the modeled data features?

  • What is the dictionary reference for the modeled data schema features (e.g., data dictionary)?

  • How was the data transformed for the final data model?

  • How was the data normalized for the final data model?

  • What is the sampling method suggested for the data asset?

  • Was synthetic data recommended for use of the data asset? If so, what is the method used?

  • Was entity resolution used for the data asset model? If so, what is the method used?

  • What challenges existed with the final model? Is there any additional context that should be noted about the modeled data?

Rubrics

The ciuTshi metamodel for metadata contains a set of baseline criteria. This can be adjusted based on the specific language or model metadata requirements. Benchmarks and metrics are flexible elements that can guide and enrich the metadata model for the institution’s specific metadata needs.

  • Benchmark is the expected suitability measure or criteria for the metadata element.

    • essential elements are metric elements of information needed to ensure that data is retained for measurable reason(s).

    • non-essential elements are elements that may not be relevant to the raw data asset of the institution in charge of the data asset.

    • recommended elements are recommended in cases where the raw data asset has set conditions upon its utilization or complexities in its interpretation.

  • Metrics is an extensible array of quantitative and qualitative features associated with the data asset element and can be augmented to suit an institution’s metric requirements. weight is the only default feature in metrics.

    • weight by default is set to 1 for each metadata element.

For more information, refer to the Metadata document.

field_name

category

definition

benchmark

metrics

mod_data_manager

modeling

manager that directs data modeling practices and personnel

essential

[‘weight’:1]

mod_deputy_data_manager

modeling

data manager that assists the primary data manager with modeling (if exists)

optional

[‘weight’:1]

mod_data_management_team

modeling

team that supports the data manager and/or deputy data manager with modeling

essential

[‘weight’:1]

mod_data_engineer

modeling

engineer(s) that support the data management team with modeling

essential

[‘weight’:1]

mod_data_owner

modeling

data personnel that produced the modeled data asset(s)

essential

[‘weight’:1]

mod_data_steward

modeling

steward that moves the raw data asset to and from data management team modeling

essential

[‘weight’:1]

mod_task_management

modeling

task management reference for data management team’s data modeling tasks

essential

[‘weight’:1]

mod_version_control

modeling

version control reference for data engineer(s) data modeling methods and code

essential

[‘weight’:1]

mod_scheme

modeling

scheme(s) for the final data model

essential

[‘weight’:1]

mod_schema

modeling

schema(s) for the final data model including data types

essential

[‘weight’:1]

mod_methods

modeling

methods derived from methodologies used in the modeling of the data asset

essential

[‘weight’:1]

mod_description

modeling

description that outlines the overall model approach

essential

[‘weight’:1]

mod_reviewers

modeling

reviewer(s) of the final model elements including scheme, schema, methods, and other essential features

essential

[‘weight’:1]

mod_tags

modeling

tags for the model’s associated categories

essential

[‘weight’:1]

mod_entities

modeling

entities that are linked to the modeled data features

essential

[‘weight’:1]

mod_dictionary

modeling

dictionary reference for the modeled data schema features

essential

[‘weight’:1]

mod_transformation

modeling

transformation details for the final data model

essential

[‘weight’:1]

mod_normalization

modeling

normalization details for the final data model

essential

[‘weight’:1]

mod_sampling

modeling

sampling method suggested for the data asset, used primarily in analytics and the data catalog

essential

[‘weight’:1]

mod_synthetic

modeling

synthetic data reference used for a raw data asset

optional

[‘weight’:1]

mod_entity_res

modeling

entity resolution reference used for a data asset model

optional

[‘weight’:1]

mod_challenge

modeling

challenges that existed with the final model

essential

[‘weight’:1]

Appendix K. Analytics

Templates

These are questions associated with data analytics. It is highly recommended that you collect all essential elements if possible in the course of data management practices for data asset analytics.

  • Who is the manager that directs data management analytics practices and personnel?

  • Who is the data manager that assists the primary data manager with analytics (if exists)?

  • Who is the team that supports the data manager and/or deputy data manager with analytics?

  • Who are the engineer(s) that support the data management team with analytics?

  • Who are the data personnel that retain the modeled data asset(s) for analytics?

  • Who is the steward that moves the modeled data asset to and from data management team analytics?

  • How is task management handled for data analytics tasks?

  • How is version control handled for data analytics methods and code?

  • What are the accuracy metrics of the modeled data asset(s)?

  • What are the completeness metrics of the modeled data asset(s)?

  • What are the consistency metrics of the modeled data asset(s)?

  • What are the integrity metrics of the modeled data asset(s)?

  • What are the reasonability metrics of the modeled data asset(s)?

  • What are the timeliness metrics of the modeled data asset(s)?

  • What are the uniqueness metrics of the modeled data asset(s)?

  • What are the validity metrics of the modeled data asset(s)?

  • What is the size of the modeled data asset(s) when stored?

  • What is the size of the raw data asset(s) when stored?

  • What is the shape of the modeled data asset(s)?

  • What is the shape of the raw data asset(s)?

  • What is the statistical profile of the modeled data asset(s) features including nulls, value ranges, data types, and frequency distributions?

  • What is the statistical profile of the raw data asset(s) features including nulls, value ranges, data types, and frequency distributions?

  • What is the format of the modeled data going through analytics processes?

Rubrics

The ciuTshi metamodel for metadata contains a set of baseline criteria. This can be adjusted based on the specific language or model metadata requirements. Benchmarks and metrics are flexible elements that can guide and enrich the metadata model for the institution’s specific metadata needs.

  • Benchmark is the expected suitability measure or criteria for the metadata element.

    • essential elements are metric elements of information needed to ensure that data is retained for measurable reason(s).

    • non-essential elements are elements that may not be relevant to the raw data asset of the institution in charge of the data asset.

    • recommended elements are recommended in cases where the raw data asset has set conditions upon its utilization or complexities in its interpretation.

  • Metrics is an extensible array of quantitative and qualitative features associated with the data asset element and can be augmented to suit an institution’s metric requirements. weight is the only default feature in metrics.

    • weight by default is set to 1 for each metadata element.

For more information, refer to the Metadata document.

field_name

category

definition

benchmark

metrics

ana_data_manager

analytics

manager that directs data analytics practices and personnel

essential

[‘weight’:1]

ana_deputy_data_manager

analytics

data manager that assists the primary data manager with analytics (if exists)

optional

[‘weight’:1]

ana_data_management_team

analytics

team that supports the data manager and/or deputy data manager with analytics

essential

[‘weight’:1]

ana_data_engineer

analytics

engineer(s) that support the data management team with analytics

essential

[‘weight’:1]

ana_data_owner

analytics

data personnel that retain the modeled data asset(s) for analytics

essential

[‘weight’:1]

ana_data_steward

analytics

steward that moves the modeled data asset to and from data management team analytics

essential

[‘weight’:1]

ana_task_management

analytics

task management reference for data management team’s data analytics tasks

essential

[‘weight’:1]

ana_version_control

analytics

version control reference for data engineer(s) data analytics methods and code

essential

[‘weight’:1]

ana_accuracy

analytics

accuracy metrics of the modeled data asset(s)

essential

[‘weight’:1]

ana_completeness

analytics

completeness metrics of the modeled data asset(s)

essential

[‘weight’:1]

ana_consistency

analytics

consistency metrics of the modeled data asset(s)

essential

[‘weight’:1]

ana_integrity

analytics

integrity metrics of the modeled data asset(s)

essential

[‘weight’:1]

ana_reasonability

analytics

reasonability metrics of the modeled data asset(s)

essential

[‘weight’:1]

ana_timeliness

analytics

timeliness metrics of the modeled data asset(s)

essential

[‘weight’:1]

ana_uniqueness

analytics

uniqueness metrics of the modeled data asset(s)

essential

[‘weight’:1]

ana_validity

analytics

validity metrics of the modeled data asset(s)

essential

[‘weight’:1]

ana_data_size_model

analytics

size of the modeled data asset(s) when stored

essential

[‘weight’:1]

ana_data_size_raw

analytics

size of the raw data asset(s) when stored

essential

[‘weight’:1]

ana_data_shape_model

analytics

shape of the modeled data asset(s)

essential

[‘weight’:1]

ana_data_shape_raw

analytics

shape of the raw data asset(s)

essential

[‘weight’:1]

ana_descriptive_statistics_model

analytics

statistical profile of the modeled data asset(s) features including nulls, value ranges, data types, and frequency distributions

essential

[‘weight’:1]

ana_descriptive_statistics_raw

analytics

statistical profile of the raw data asset(s) features including nulls, value ranges, data types, and frequency distributions

essential

[‘weight’:1]

ana_format

analytics

the format of the modeled data going through analytics processes

essential

[‘weight’:1]
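The shape and descriptive statistics elements above (ana_data_shape_* and ana_descriptive_statistics_*) can be gathered with a small profiling routine. The following is a minimal sketch only, not a ciuTshi reference implementation. It assumes the data asset can be read as a list of row dictionaries with hashable values; the function names and the simple ratio definitions of completeness and uniqueness are illustrative assumptions.

```python
from collections import Counter


def profile_column(values):
    """Profile one feature: nulls, value range, data types, and frequencies."""
    present = [v for v in values if v is not None]
    # Value range is only computed for numeric values (bool is excluded).
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    return {
        "count": len(values),
        "nulls": len(values) - len(present),
        # Assumed definitions: share of non-null values, share of distinct values.
        "completeness": len(present) / len(values) if values else 0.0,
        "uniqueness": len(set(present)) / len(present) if present else 0.0,
        "data_types": sorted({type(v).__name__ for v in present}),
        "value_range": (min(numeric), max(numeric)) if numeric else None,
        "frequencies": dict(Counter(present)),
    }


def profile_asset(rows):
    """Return the shape and a per-feature profile for a list of row dicts."""
    features = sorted({key for row in rows for key in row})
    shape = (len(rows), len(features))  # (records, features)
    columns = {f: profile_column([row.get(f) for row in rows]) for f in features}
    return {"shape": shape, "features": columns}


# Hypothetical sample rows, for illustration only.
rows = [
    {"site": "A", "reading": 4.0},
    {"site": "B", "reading": None},
    {"site": "A", "reading": 7.5},
]
report = profile_asset(rows)
print(report["shape"])
print(report["features"]["reading"])
```

Running the same routine on the raw and the modeled asset gives comparable entries for the _raw and _model variants of each element.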

Appendix L. Integration

Templates

These are questions associated with data integration. It is highly recommended that you collect all essential elements if possible in the course of data management practices for data integration.

  • Who is the manager that directs data integration practices and personnel?

  • Who is the data manager that assists the primary data integration manager (if one exists)?

  • Who is the project manager overseeing integration of the modeled data?

  • Who is the counsel that advises on the legal limits for integration of a modeled data asset?

  • Who is the project lead in charge of receiving the data integration?

  • Who is on the team that supports the data integration manager and/or deputy data manager?

  • Who is the project team in charge of receiving the data integration?

  • Who is the engineer that supports the data integration management team?

  • Who is the owner that produced or controls the modeled data asset?

  • Who is the steward that moves the modeled data integration to the data manager?

  • Who is the security agent that is in charge of tracking loading of and access to modeled data assets and integrations?

  • What is the overall plan for the integration endpoint deployment and delivery?

  • What is the assessment of proposed final integration behaviors and outputs including format, quality, documentation, metadata lineage updates, data asset provenance, analytic metrics, and policy compliance?

  • What are the resources and personnel tasked to implement the integration endpoint design?

  • What database(s) were used for the integration endpoint solution?

  • What are the API details for the integration endpoint solution?

  • What are the development details for the integration endpoint solution including testing and logging information?

  • What are the date(s) and time(s) the integration endpoint was first accessed by the customer?

  • What are the date(s) and time(s) the integration endpoint ceased to be available to the customer?

  • How is the data asset used by the customer from the endpoint?

  • What are the monitoring details, metrics, and actions to be conducted for a deployed endpoint until end of lifecycle?

Rubrics

The ciuTshi metamodel for metadata contains a set of baseline criteria. This can be adjusted based on the specific language or model metadata requirements. Benchmarks and metrics are flexible elements that can guide and enrich the metadata model for the institution’s specific metadata needs.

  • Benchmark is the expected suitability measure or criteria for the metadata element.

    • essential elements are metric elements of information needed to ensure that data is retained for measurable reason(s).

    • non-essential elements are elements that may not be relevant to the raw data asset of the institution in charge of the data asset.

    • recommended elements are recommended in cases where the raw data asset has set conditions upon its utilization or complexities in its interpretation.

  • Metrics is an extensible array of quantitative and qualitative features associated with the data asset element and can be augmented to suit an institution’s metric requirements. weight is the only default feature in metrics.

    • weight by default is set to 1 for each metadata element.

For more information, refer to the Metadata document.
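One way to hold a rubric row in code is a small record type with the five columns used in the tables of this appendix. This is a sketch under assumptions: the class name RubricElement is hypothetical, metrics is modeled as a dictionary whose only default feature is weight set to 1, and the added metric review_interval_days is an invented example of how an institution might augment it.

```python
from dataclasses import dataclass, field


@dataclass
class RubricElement:
    """One rubric row: field_name, category, definition, benchmark, metrics."""
    field_name: str
    category: str
    definition: str
    benchmark: str
    # Each element gets its own metrics dict; weight defaults to 1.
    metrics: dict = field(default_factory=lambda: {"weight": 1})


int_data_manager = RubricElement(
    field_name="int_data_manager",
    category="integration",
    definition="manager that directs data integration practices and personnel",
    benchmark="essential",
)

int_api = RubricElement(
    field_name="int_api",
    category="integration",
    definition="API details for the integration endpoint solution",
    benchmark="recommended if API used, else optional",
)
# Hypothetical institution-specific metric added alongside the default weight.
int_api.metrics["review_interval_days"] = 90

print(int_data_manager.metrics)
print(int_api.metrics)
```

Using default_factory keeps the metrics of one element from leaking into another when an institution extends them.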

field_name

category

definition

benchmark

metrics

int_data_manager

integration

manager that directs data integration practices and personnel

essential

[‘weight’:1]

int_deputy_data_manager

integration

data manager that assists the primary data integration manager (if one exists)

optional

[‘weight’:1]

int_project_manager

integration

the manager in charge of receiving the integration (possibly same as sto_project_manager)

essential

[‘weight’:1]

int_legal_counsel

integration

counsel that advises on the legal limits for integration of a modeled data asset

optional

[‘weight’:1]

int_project_lead

integration

the lead in charge of guiding implementation of the integration (possibly same as sto_project_lead)

optional

[‘weight’:1]

int_data_management_team

integration

team that supports the data integration manager and/or deputy data manager

essential

[‘weight’:1]

int_project_team

integration

the team in charge of implementing the integration (possibly same as sto_project_team)

optional

[‘weight’:1]

int_data_engineer

integration

engineer that supports the data integration management team

essential

[‘weight’:1]

int_data_owner

integration

owner that produced or controls the modeled data asset

essential

[‘weight’:1]

int_data_steward

integration

steward that moves the modeled data integration to the data manager

essential

[‘weight’:1]

int_tta

integration

security agent that is in charge of tracking loading of and access to modeled data assets and integrations

essential

[‘weight’:1]

int_plan

integration

plan description for the integration endpoint deployment and delivery

essential

[‘weight’:1]

int_assessment

integration

assessment of proposed final integration behaviors and outputs including format, quality, documentation, metadata lineage updates, data asset provenance, analytic metrics, and policy compliance

essential

[‘weight’:1]

int_design

integration

description of resources and personnel tasked to implement the integration endpoint design

essential

[‘weight’:1]

int_database

integration

database details for the integration endpoint solution

recommended if database used, else optional

[‘weight’:1]

int_api

integration

API details for the integration endpoint solution

recommended if API used, else optional

[‘weight’:1]

int_development

integration

development details for the integration endpoint solution including testing and logging information

essential

[‘weight’:1]

int_endpoint_implemented

integration

date and time the integration endpoint was first accessed by the customer (this may be several datetimes, each needing an explanation)

essential

[‘weight’:1]

int_endpoint_deprecated

integration

date and time the integration endpoint ceased to be available to the customer (this may be several datetimes, each needing an explanation)

essential

[‘weight’:1]

int_provenance

integration

description of how the data asset is used by the customer from the endpoint

essential

[‘weight’:1]

int_monitoring

integration

description of monitoring details, metrics, and actions to be conducted for a deployed endpoint until end of lifecycle

essential

[‘weight’:1]
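The int_endpoint_implemented and int_endpoint_deprecated elements may each hold several datetimes, each needing an explanation. The sketch below shows one possible way to record them and derive the periods in which the endpoint was available. It is illustrative only: the function name, the (datetime, explanation) tuple layout, the pairing of events in chronological order, and the sample values are all assumptions rather than part of the standard.

```python
from datetime import datetime, timezone


def availability_windows(implemented, deprecated):
    """Pair each 'implemented' event with the deprecation that follows it.

    Both arguments are lists of (datetime, explanation) tuples. A window
    whose end is None means the endpoint is still available.
    """
    for when, explanation in implemented + deprecated:
        if not explanation.strip():
            raise ValueError(f"no explanation recorded for {when.isoformat()}")
    starts = sorted(when for when, _ in implemented)
    ends = sorted(when for when, _ in deprecated)
    if len(ends) > len(starts):
        raise ValueError("more deprecation events than implementation events")
    windows = []
    for i, start in enumerate(starts):
        end = ends[i] if i < len(ends) else None
        if end is not None and end < start:
            raise ValueError(
                f"deprecation {end.isoformat()} precedes {start.isoformat()}"
            )
        windows.append((start, end))
    return windows


utc = timezone.utc
# Hypothetical values, for illustration only.
int_endpoint_implemented = [
    (datetime(2024, 1, 15, 9, 0, tzinfo=utc), "initial delivery to customer"),
    (datetime(2024, 6, 3, 9, 0, tzinfo=utc), "redeployed after schema migration"),
]
int_endpoint_deprecated = [
    (datetime(2024, 5, 20, 17, 0, tzinfo=utc), "taken offline for schema migration"),
]

windows = availability_windows(int_endpoint_implemented, int_endpoint_deprecated)
for start, end in windows:
    print(start.isoformat(), "->", end.isoformat() if end else "still available")
```

Rejecting events without an explanation enforces the note in the definitions that every datetime needs one.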

Appendix M. Metadata

Templates

These are questions associated with metadata for a data asset. It is highly recommended that you collect all essential elements if possible in the course of data management practices for metadata. This section in particular can draw from all other sections as Appendix A is the metamodel for metadata.

  • What is the scheme for the data asset’s categorical metadata areas?

  • What tags are associated with the data asset?

  • Which personnel contributed to the metadata for a data asset?

  • What was the metadata strategy for the data asset?

  • How is version control managed for the metadata assets?

  • How is the metadata to be maintained?

Rubrics

The ciuTshi metamodel for metadata contains a set of baseline criteria. This can be adjusted based on the specific language or model metadata requirements. Benchmarks and metrics are flexible elements that can guide and enrich the metadata model for the institution’s specific metadata needs.

  • Benchmark is the expected suitability measure or criteria for the metadata element.

    • essential elements are metric elements of information needed to ensure that data is retained for measurable reason(s).

    • non-essential elements are elements that may not be relevant to the raw data asset of the institution in charge of the data asset.

    • recommended elements are recommended in cases where the raw data asset has set conditions upon its utilization or complexities in its interpretation.

  • Metrics is an extensible array of quantitative and qualitative features associated with the data asset element and can be augmented to suit an institution’s metric requirements. weight is the only default feature in metrics.

    • weight by default is set to 1 for each metadata element.

For more information, refer to the Metadata document.

field_name

category

definition

benchmark

metrics

met_metamodel

metadata

scheme description for the data asset’s categorical metadata areas

essential

[‘weight’:1]

met_tags

metadata

tags associated with the data asset metadata considerations including type, associated projects, security labels, and associated policy restrictions

essential

[‘weight’:1]

met_editors

metadata

list of personnel contributing to the metadata for a data asset

essential

[‘weight’:1]

met_strategy

metadata

strategy for use of a data asset via metadata element (e.g., knowledge discovery) including migrations

essential

[‘weight’:1]

met_vcs

metadata

link to version control for metadata assets

essential

[‘weight’:1]

met_schedule

metadata

schedule and practice details associated with metadata maintenance including review and enrichment

essential

[‘weight’:1]

Appendix N. Cataloging

Templates

These are questions associated with cataloging of data. It is highly recommended that you collect all essential elements if possible in the course of data management practices for data cataloging.

  • Who manages data cataloging practices and personnel for the data asset?

  • Who assists the primary data catalog manager (if one exists)?

  • Who is the counsel that advises on the legal limits for use of cataloged metadata or master data samples?

  • Who is the team that supports the data manager and/or deputy data manager?

  • Who is the engineer that supports the data catalog management?

  • Who owns the master data sample?

  • Who is the steward that moves the master data sample to the data manager?

  • What annotations are associated with the cataloged data record?

  • What feedback method(s) are available for the data sample and associated metadata entry?

  • What is the monitoring plan for the data catalog?

Rubrics

The ciuTshi metamodel for metadata contains a set of baseline criteria. This can be adjusted based on the specific language or model metadata requirements. Benchmarks and metrics are flexible elements that can guide and enrich the metadata model for the institution’s specific metadata needs.

  • Benchmark is the expected suitability measure or criteria for the metadata element.

    • essential elements are metric elements of information needed to ensure that data is retained for measurable reason(s).

    • non-essential elements are elements that may not be relevant to the raw data asset of the institution in charge of the data asset.

    • recommended elements are recommended in cases where the raw data asset has set conditions upon its utilization or complexities in its interpretation.

  • Metrics is an extensible array of quantitative and qualitative features associated with the data asset element and can be augmented to suit an institution’s metric requirements. weight is the only default feature in metrics.

    • weight by default is set to 1 for each metadata element.

For more information, refer to the Metadata document.

field_name

category

definition

benchmark

metrics

cat_data_manager

cataloging

manager that directs data cataloging practices and personnel

essential

[‘weight’:1]

cat_deputy_data_manager

cataloging

data manager that assists the primary data manager (if one exists)

optional

[‘weight’:1]

cat_legal_counsel

cataloging

counsel that advises on the legal limits for use of cataloged metadata or master data samples

optional

[‘weight’:1]

cat_data_management_team

cataloging

team that supports the data manager and/or deputy data manager

essential

[‘weight’:1]

cat_data_engineer

cataloging

engineer that supports the data management team

essential

[‘weight’:1]

cat_data_owner

cataloging

owner that produced the master data sample

essential

[‘weight’:1]

cat_data_steward

cataloging

steward that moves the master data sample to the data manager

essential

[‘weight’:1]

cat_annotations

cataloging

annotations on a cataloged data record noting special access or handling instructions for specific sections of the metadata within and outside of the catalog

essential

[‘weight’:1]

cat_feedback

cataloging

feedback method(s) associated with a data sample for catalog users

essential

[‘weight’:1]

cat_monitoring

cataloging

monitoring plan for the data catalog

essential

[‘weight’:1]
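The benchmark and weight columns above can be used to report how much of a rubric has been populated for a data asset. The following sketch computes a weighted coverage of the essential cataloging elements. The field names and benchmarks are taken from the cataloging rubric; the function name, the scoring formula, and the sample record are assumptions made for illustration and are not prescribed by ciuTshi.

```python
# (field_name, benchmark, weight) as listed in the cataloging rubric.
CATALOGING_RUBRIC = [
    ("cat_data_manager", "essential", 1),
    ("cat_deputy_data_manager", "optional", 1),
    ("cat_legal_counsel", "optional", 1),
    ("cat_data_management_team", "essential", 1),
    ("cat_data_engineer", "essential", 1),
    ("cat_data_owner", "essential", 1),
    ("cat_data_steward", "essential", 1),
    ("cat_annotations", "essential", 1),
    ("cat_feedback", "essential", 1),
    ("cat_monitoring", "essential", 1),
]


def essential_coverage(record, rubric=CATALOGING_RUBRIC):
    """Return (weighted coverage in [0, 1], missing essential field names).

    A field counts as populated when the record holds a non-empty value.
    """
    essential = [(name, w) for name, benchmark, w in rubric if benchmark == "essential"]
    total = sum(w for _, w in essential)
    covered = sum(w for name, w in essential if record.get(name))
    missing = [name for name, _ in essential if not record.get(name)]
    return (covered / total if total else 1.0), missing


# Hypothetical, partially populated record.
record = {
    "cat_data_manager": "J. Doe",
    "cat_data_management_team": ["A. Roe", "B. Poe"],
    "cat_data_engineer": "C. Low",
    "cat_data_owner": "Example Lab",
    "cat_data_steward": "D. Kay",
    "cat_annotations": "",
    "cat_feedback": "catalog comment form",
}

score, missing = essential_coverage(record)
print(score)    # 0.75
print(missing)  # ['cat_annotations', 'cat_monitoring']
```

Raising the weight of an element in the rubric increases its share of the score, which is one way an institution could express that some essential elements matter more than others.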