First: Introduction
Proceeding from GASTAT's mission to provide updated, value-added, accurate, comprehensive and reliable statistical products and services based on the best international standards and practices; to take the lead in developing the statistical sector in support of decision-making; and to realize its vision of being the most distinctive and innovative statistical reference in support of the socio-economic growth of KSA, the GSBPM represents one of the key enablers of this objective. The GSBPM refers to the set of processes and activities that statistical production requires in order for GASTAT statistical departments to provide high-quality, accurate and complete statistics based on state-of-the-art worldwide best practices. The Model also serves as a means of organizing the statistical sector and shaping its operations as specified in Article 4 (2) of Council of Ministers Resolution No. 11 of 13 January 1437 AH on Approving the Organization of the General Authority for Statistics, which states: «GASTAT is the sole authority entrusted with carrying out, supervising and regulating statistical activities. To this end, it shall be vested with the following competencies: Carry out statistical activities in conformity with internationally recognized standards, including: Identify the statistical methodology».
As a member of the worldwide statistical community, GASTAT incorporated international recommendations and practices into the National Version of the GSBPM while keeping its organizational environment in mind.
Second: Terms and Definitions
The following terms, wherever they appear in this GSBPM, shall have the meanings assigned to each of them:
1. Authority: General Authority for Statistics.
2. Statistical Sector: The General Authority for Statistics, being the central body responsible for all statistical activities, together with the integrated matrix of all departments, units and statistical divisions across all sectors and institutions within the State.
3. The Model: The (National Version) of the Generic Statistical Business Process Model.
4. International Model: Version (5.1) of the Generic Statistical Business Process Model (GSBPM) published by the United Nations Economic Commission for Europe (UNECE) in January 2019.
5. Administrative Records: Paper or electronic records of statistical relevance in which data or information is documented at other entities, including records related to community conditions and activities at large.
Third: Overview
GASTAT established this version as a national version compatible with GSBPM (5.1), issued in January 2019 by the United Nations Economic Commission for Europe (UNECE).
This national version is expected to witness further updates over the coming years to cope with the developments and models introduced by the High-Level Group for the Modernization of Official Statistics (HLG-MOS) such as the Generic Statistical Business Process Model (GSBPM).
Fourth: GSBPM Objectives and Usage
1. Improve and standardize the statistical processes used in the creation of statistics within both GASTAT and the statistical sector.
2. Facilitate collaboration and coordination across departments engaged in the statistics generating processes.
3. Provide a framework for assessing the quality of the statistical process, as well as a mechanism to standardize operational standards to the greatest extent possible, thereby promoting a culture of quality, accuracy, comprehensiveness and high productivity in statistics.
4. Enable the integration of international efforts on statistical metadata and data quality by providing a common framework and nomenclature to describe statistical processes.
5. Allow methodological criteria to be linked to the relevant phases or sub-processes and then classified and stored in a GSBPM-based structure.
6. Provide a tool to review and improve statistical classifications.
7. Provide a starting point for developing processes aimed at deploying new products.
8. Enable governance of statistical activities and processes.
9. Create a professional and training resource for GASTAT's and the statistical sector's personnel.
10. Estimate the cost of different elements of the statistical process to determine the operational expenses, which can then be used to identify modernization efforts to enhance cost efficiency of the most expensive parts of the process.
11. Measure system performance by identifying components that are inefficient, redundant, or need to be replaced, and identifying gaps for which new components should be developed.
12. Provide ongoing assessment and development tools.
13. Provide a tool for harmonizing processes associated with non-statistical data providers (such as administrative and geospatial data), facilitate communication between statisticians and experts from other areas and coordinate related terminology.
Fifth: GSBPM Development Methodology
As previously indicated, the National Version was created to be compatible with GSBPM v. (5.1). Certain details, however, were added to provide more explanation. Among the most important factors considered in developing this version of the Model are the following:
• The international GSBPM v. (5.1) in its entirety serves as a basis and a starting point for developing and producing this National Version.
• Naming the phases in line with the international model.
• Structural design of phases in line with the international model.
• The «phases» in the National Version, like the international version, are made up of «sub-processes».
• The set of «activities» under each sub-process in the international version was considered a third level in the National Version, with the inclusion of certain activities not found in the international version. The work will be made consistent with the results of the UNECE «GSBPM Tasks» work-group.
Sixth: GSBPM Structure
The National Version of GSBPM consists of four levels, as follows:
• Level 0: Statistical business processes.
• Level 1: The eight phases of the statistical business process.
• Level 2: The sub-processes within each phase.
• Level 3: Activity/activities within each sub-process.
Although each sub-process in the international version contains a set of activities, those activities are not treated there as a third level and are not given the same detail and clarification adopted in the National Version.
• The statistical business process of the Model is broken down into eight related phases, as follows:
Figure 1: The phases (level 1) and sub-processes (level 2) of the GSBPM
• Phases and sub-processes are designed in a flexible manner. Some elements of the Model may fit one type of statistics more than others, and some elements may overlap or be repeated for a given product until final approval is reached.
• The Model contains (8) phases and (44) sub-processes. The Model also sets out a number of activities under each sub-process, describing it in detail.
In addition to these levels, which are explained in Section Eight of this Version, Section Nine has been dedicated to the overarching processes that apply throughout the eight phases. These overarching processes include quality management, metadata management and data management:
1) Quality Management - This process includes quality assessment and control mechanisms. It recognizes the importance of evaluation and feedback throughout the statistical business process.
2) Metadata Management - Metadata is generated and utilized at each phase, necessitating an integrated system for its management to ensure the preservation of the relationship between data and metadata. This also includes the preservation of metadata, its ownership, and the rules for its archiving and disposal.
3) Data Management - This process encompasses aspects such as data security, ownership, quality, archiving and retention rules, and disposal procedures.
- Process Data Management - This includes activities related to recording, organizing, and utilizing data associated with the execution of the statistical process itself. Process data can help identify and understand patterns in the collected data and contribute to assessing the implementation of the statistical process.
- Knowledge Management - This ensures that statistical business processes are repeatable, mainly through the maintenance of process documentation.
- Provider Management - This includes managing the burden on data providers, as well as their classification and the management of their contact information. This process is closely related to statistical processes that rely on records.
Seventh: Applicability
1. To produce a statistical product for the first time, all eight phases must be used:
Figure (2): Phases of statistical business
2. Four phases are reviewed upon re-release of the statistical product:
• Fourth Phase (Collection)
• Fifth Phase (Processing)
• Sixth Phase (Analysis)
• Seventh Phase (Dissemination)
3. Each time a statistical product is produced, the need to review the following phases should be determined:
• First Phase (Specify Needs)
• Second Phase (Design)
• Third Phase (Build)
4. GSBPM applies several overarching statistical processes throughout the eight phases, along with other details pertaining to quality management, metadata management, and data management. The overarching processes will also be updated following specifications released by the UNECE Supporting Standards group.
5. The Model is designed to be applicable regardless of the data source, so it can be used for the description and quality assessment of processes based on surveys, censuses, administrative registers, and other non-statistical or mixed sources.
6. The sectioning process used in the Model allows the phases to be applied sequentially or in parallel depending on the type of statistical product.
Eighth: Phases of statistical business
| Phase | Name | Description |
| First | Specify needs | Understand and confirm statistical needs, and identify possible solutions. |
| Second | Design | Design the processes and services needed to implement the procedures in the next phases. |
| Third | Build | Build, organize, gather and test production solutions. |
| Fourth | Collect | Collect and validate data. |
| Fifth | Process | Process data, converting it from raw to usable statistical data. |
| Sixth | Analyze | Validate and interpret outputs. |
| Seventh | Disseminate | Prepare the products to be ready for dissemination. |
| Eighth | Evaluate | Conduct an evaluation of the statistical process. |
01 | Phase one: Specify needs
This phase aims to:
1. Understand the context around the specific need or change.
2. Identify customer needs and requirements.
3. Set clear limits and targets for the statistical product.
4. Identify the range of solutions to meet needs.
This phase is triggered in two cases:
1. A need for new statistics is identified.
2. Amendments to the (current) statistical product initiate a review.
This phase consists of (6) sub-processes:
• First sub-process : Identify needs
• Second sub-process : Consult and confirm needs
• Third sub-process : Establish output objectives
• Fourth sub-process : Identify concepts
• Fifth sub-process : Check data availability
• Sixth sub-process : Prepare and submit business case
Sub-processes and activities in Phase one: Specify needs
1.1 First sub-process: Identify needs
This process is based on understanding customers' needs for the statistical product and reviewing observations and feedback on current statistics, so that the actual need for the statistical product, and the feasibility of producing it (or improving it if it already exists), are identified.
• Activities of the Process:
1.1.1 Investigate and understand the specific statistical needs in preparation for consulting with the customer.
1.1.2 Verify the background of the customers, the entity they represent, and the reasons for requesting the product or improving it if it already exists.
1.1.3 Identify the required statistics and objectives.
1.1.4 Identify priorities and challenges, and propose solutions to the customer.
1.1.5 Consider available statistical assets that might meet the customer's need, as well as practices among other (national and international) statistical organizations producing similar data and the methods used in those organizations.
1.1.6 Provide action plans from evaluations of previous iterations of the process or from other processes.
1.2 Second sub-process: Consult and confirm needs
• Activities of the Process:
1.2.1 Reach out and consult customers, holding the workshops and meetings necessary to gain a deep understanding of their needs.
1.2.2 Clarify the concepts to be measured from the customers' perspective, and ensure they are consistent with existing statistical standards.
1.2.3 Confirm and document statistical needs.
1.3 Third sub-process: Establish output objectives
• Activities of the Process:
1.3.1 Establish the objectives of the statistical output needed to meet the user needs identified during consultation with customers (sub-process 1.2), as well as details of the required products and services, including timing and quality, and expected outputs such as cross-tabulations, diagrams, quality standards, priorities and expected challenges. Some restrictions are likely to appear when establishing output objectives, such as legal frameworks (e.g. relating to confidentiality) and available resources.
1.3.2 Align with stakeholders and agree on the appropriateness of proposed outputs and their quality metrics.
1.4 Fourth sub-process: Identify concepts
• Activities of the Process:
1.4.1 Identify and define concepts that will be measured (these concepts are the same as those identified by the customer).
1.4.2 Align the required concepts with existing statistical standards whenever possible.
1.5 Fifth sub-process: Check data availability
• Activities of the Process:
1.5.1 Research the availability of data and methodologies to help develop solutions that meet statistical needs.
1.5.2 Identify data sources that may be suitable for statistical purposes, including statistics, registers, frames, and processing and analysis models, as well as the quality of those data, their availability timeline, security, and continuity of supply, among other aspects.
1.5.3 Observe any restrictions or conditions regarding available data sources.
1.5.4 Coordinate with potential providers of administrative data to discuss the availability and quality of the data in relation to the concepts being measured, and to define the division of responsibilities between the data providers and GASTAT.
1.5.5 Check ICT resources (e.g. data storage, technology required to handle incoming data and data processing) as well as any formal agreements with data providers for accessing and sharing the data (e.g. formats, delivery, accompanying metadata and quality check).
1.5.6 Collect information about legal frameworks that affect the extent to which the data can be used.
1.5.7 Consider the common practices of international statistical agencies, and other agencies that produce similar data.
1.6 Sixth sub-process: Prepare and submit business case
The outputs of this sub-process are the future design and development decisions for the statistical product and their presentation to the customer for approval to proceed to the design phase.
• Activities of the Process:
1.6.1 Conduct feasibility studies based on the data from the previous sub-processes of this phase before suitable solution options can be proposed (sometimes there may not be suitable solution options that can adequately meet identified needs).
1.6.2 Propose initial solution options, including:
- Propose one or more of the initial solution options to the concerned GASTAT committees, to meet previously confirmed statistical needs.
- Present a set of documented and estimated solution options, and review them after evaluating their feasibility and applicability.
- Suggest ways to fill any remaining gaps (compile new outputs from existing data sources, conduct analyses, develop new frameworks or classifications and conduct a new survey). This may include identifying possible partnerships with data holders. This sub-process also includes a more general assessment of the legal framework in which data would be collected and used, and may therefore identify proposals for changes to existing legislation or the introduction of a new legal framework.
1.6.3 Prepare accurate cost calculations for the possible solution options in order to assess their feasibility.
1.6.4 Support the business case with sufficient detail to provide the relevant reviewers and stakeholders with:
a. A clear understanding of what will be produced.
b. Justification and method of production (detailed).
c. Cost, risks and timing.
Note: The level of detail in the business case needs to be commensurate with the expected cost, complexity, risk, business importance, as well as the quality of the required outputs.
Note: A decision, either approval or rejection, is made, and the feasibility study is typically reviewed and formally approved or declined by the authorized decision-maker and governance committees.
02 | Phase two: Design
This phase is triggered in two cases:
1. Designing a new product based on the «Specify Needs» phase, after understanding and confirming statistical needs and identifying possible solutions.
2. Redesigning the product for improvement and development based on the outputs of the Eighth Phase «Evaluate».
This phase aims to:
1. Develop detailed practical solutions for statistical product design (including all identification work required prior to automation and data collection).
2. Apply a range of common reusable methodologies and instruments using international and national standards to reduce the length and cost of the design process and enhance the comparability and usability of outputs.
This phase consists of (6) sub-processes:
• First sub-process : Design outputs.
• Second sub-process : Design variable descriptions
• Third sub-process : Design collection
• Fourth sub-process : Design frame and sample
• Fifth sub-process : Design processing and analysis
• Sixth sub-process : Design production systems and workflow
Sub-processes and Activities in Phase two: Design Phase
2.1 First sub-process: Design outputs
This sub-process contains the detailed design of the statistical outputs, products and services to be produced, including the related development work and preparation of the systems and tools used in the «Disseminate» phase.
• Activities of the Process:
2.1.1 Design the processes governing access to any confidential outputs.
2.1.2 Outputs should be designed to follow existing standards wherever possible, so inputs to this process may include metadata from the same or similar outputs produced in previous cycles. This allows results to be compared between different versions and methodological differences or other changes, such as changes in classifications, to be clarified. (Examples of metadata for outputs include: description of the indicator or table, statistical unit, classifications, reference period, methodology, data quality, etc.)
2.1.3 Outputs may also be designed in partnership with other interested bodies, particularly if they are considered to be joint outputs, or they will be disseminated by another organization.
2.1.4 Identify mechanisms for providing access to detailed data through dissemination channels.
2.1.5 Identify the timing of the release.
2.1.6 Identify information security procedures.
2.1.7 Identify the quality metrics that are supposed to be disseminated.
2.1.8 Identify the statistical product promotion plan.
2.2 Second sub-process: Design variable descriptions
This sub-process includes:
- Define the variables collected through data collection instruments, as well as any other variables that will be derived in the Fifth Phase «Process».
- Define the units, classifications, standards and statistical frameworks that are used (existing national and international standards).
• Activities of the Process:
2.2.1 Identify variables, including:
- Identify and define variables to be used to measure specific phenomena (variables to be collected by data collection instruments, as well as any other variables derived from them, ideally for processing, statistical analysis or dissemination).
2.2.2 Define classifications, standards and statistical frameworks, including:
- Define the classifications, standards and statistical frameworks through which statistical data is classified.
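As a purely illustrative sketch of how a variable description might be recorded alongside the classification it uses (the field names, the example variable and the two ISIC Rev.4 section codes shown are assumptions for illustration, not a GASTAT schema):

```python
# Hedged sketch: a variable description linked to the classification it uses.
# Field names and the example entries are hypothetical.
VARIABLES = {
    "economic_activity": {
        "definition": "Main economic activity of the establishment",
        "classification": "ISIC4",
        "unit": "establishment",
    }
}

# A tiny excerpt of a classification scheme, keyed by code.
CLASSIFICATIONS = {
    "ISIC4": {
        "A": "Agriculture, forestry and fishing",
        "C": "Manufacturing",
    }
}

def classify(variable, code):
    # Look up the classification attached to the variable, then resolve the code.
    scheme = CLASSIFICATIONS[VARIABLES[variable]["classification"]]
    return scheme[code]

print(classify("economic_activity", "C"))  # Manufacturing
```

Storing the classification reference with the variable, rather than copying codes into each dataset, is what keeps derived variables aligned with existing statistical standards.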
2.3 Third sub-process: Design collection
This sub-process aims to identify, document, and test data sources, methodologies, tools, and supporting materials to efficiently and effectively collect the required data from providers, and to redevelop and revise them where necessary before adopting them.
The sub-process activities in this phase also vary depending on the data collection instrument used. These tools may include computer-assisted interviews, paper questionnaires, administrative records, data transmission methods, web data extraction techniques, and geospatial data technologies.
• Activities of the Process:
2.3.1 Define the method of data collection, including:
- Identify data sources and methods of collection.
- Determine the data collection method (online inquiries, file transfers, CAPI computer-assisted personal interviews, CATI computer-assisted telephone interviews, CAWI computer-assisted web interviews, etc.).
- Develop a plan to test and validate data collection method (content testing, sequence, usability, security, etc.)
- Develop release and dissemination plans, intensive follow-up and reconnection with data providers, and training requirements.
2.3.2 Identify and design data collection tools and supporting materials, including:
- Identify and design the supporting tools and materials used to collect data such as (forms and questionnaires).
- Identify metadata on the content of questionnaires (questions, sequence, interviewer’s instructions, modifications, online help, etc.).
- Identify metadata for administrative data collection tools as a set of requirements using standard models.
- Prepare the necessary programs related to data collection tools.
2.3.3 Obtain approval of the collection method, tools, metadata resulting from other processes, and supporting materials from the relevant authorities and the Authority’s committees, for implementation in the Fourth Phase «Collect».
2.4 Fourth sub-process: Design frame and sample
This sub-process is concerned with identifying and improving the plan for the statistical frame and the survey sample, testing it for validity, and finding mechanisms to combine data sources (administrative and statistical records) for use within the statistical frame.
• Activities of the Process:
2.4.1 Identify sampling units, including:
- Identify and define statistical community units from which data is collected.
- Identify and define the required units to ideally analyze, process and disseminate data (analysis units).
2.4.2 Design the statistical frames plan, including:
- Design and document the optimal plan for creating statistical frameworks, to cover the target statistical community.
- Review lists, maps and analytical specifications of units from which data providers can be selected.
- Use shared statistical records and frameworks as much as possible, and describe how to combine shared sources for the sample framework, if any, such as: Administrative and statistical records, censuses and information from other survey samples, which may include geospatial data and classifications.
- Identify the metadata required to establish the statistical and testing framework, validate and approve its use for the current survey cycle.
- The sample frame itself is established later, in the Fourth Phase «Collect».
2.4.3 Design the sampling plan, including:
- Design and document the optimal plan to select a sample of the units from which the data are collected.
- Identify the most appropriate sampling methodologies, with a view to providing output with the required quality, with a minimum burden on data providers. The burden on the data provider is usually reduced by using rotation techniques and controlling for overlap between survey samples.
- The actual sample is created later, in the Fourth Phase «Collect».
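The sampling plan described above can be illustrated with a minimal sketch of stratified selection from a frame. The frame layout, the allocation per stratum and the function name are hypothetical; actual survey sampling would follow the methodologies specified in the documented plan.

```python
import random

# Illustrative sketch only: draw a fixed number of units from each stratum
# of a frame, given per-stratum allocations decided in the sampling plan.
def select_stratified_sample(frame, allocations, seed=0):
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sample = []
    for stratum, n in allocations.items():
        units = [u for u in frame if u["stratum"] == stratum]
        # Sample without replacement, never more than the stratum holds.
        sample.extend(rng.sample(units, min(n, len(units))))
    return sample

# A toy frame of 10 units in two strata.
frame = [{"id": i, "stratum": "A" if i < 6 else "B"} for i in range(10)]
sample = select_stratified_sample(frame, {"A": 2, "B": 3})
print(len(sample))  # 5 units drawn across the two strata
```

Controlling the allocation per stratum is also where rotation techniques and overlap control between survey samples would be applied to reduce provider burden.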
2.5 Fifth sub-process: Design processing and analysis
This sub-process designs the processing and statistical analysis plans that will be carried out during the Fifth Phase «Process» and the Sixth Phase «Analyze». These plans include:
- Specifications and mechanisms of processing and analysis.
- Rules and standards for data linkage, coding and imputation of missing data.
Aspects of these plans may need to be revised during implementation, particularly when they are not performing as expected once the actual data for the statistical product are available.
• Activities of the Process:
2.5.1 Design data linkage plan, including:
- Design data integration specifications from multiple sources.
- Design statistical disclosure control methods, to remove anything indicative of identity.
2.5.2 Design data auditing and estimation processes, including:
- Design optimal models to audit and estimate data through:
- Estimate target statistical communities and variables for which there are no directly available data sources for the respective time period.
- Use statistical analysis methods and available test data (such as data from records and similar data from other sources).
- Estimates extracted through sample weighting processes.
- Seasonally adjusted expectations and estimates.
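The idea of estimates extracted through sample weighting can be sketched as a simple design-weighted total, where each sampled unit is weighted by the inverse of its inclusion probability (a Horvitz-Thompson style estimator; the values and names are illustrative only, not a prescribed GASTAT method):

```python
# Hedged sketch: a design-weighted estimate of a population total.
# Each sampled unit represents 1/p population units, where p is its
# inclusion probability from the sampling design.
def weighted_total(values, inclusion_probs):
    if len(values) != len(inclusion_probs):
        raise ValueError("values and probabilities must align")
    return sum(y / p for y, p in zip(values, inclusion_probs))

# Three sampled units, each selected with probability 0.1,
# i.e. each represents about 10 population units.
estimate = weighted_total([120.0, 80.0, 100.0], [0.1, 0.1, 0.1])
print(estimate)  # 3000.0
```

The design of this step fixes which weights are used and how they are adjusted (e.g. for non-response) before the estimates are produced in the Fifth and Sixth Phases.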
2.5.3 Design data validation plan, including:
- Design and document the optimal plan for data coding and validation used to infer the outputs (e.g. data inputs) as well as the outputs themselves.
- Validate data against expectations to ensure suitability for use.
- Analyze, process, interpret and explain data, maintain confidentiality and revise outputs.
- Select and test a group of methods and processes, validate and finalize them prior to implementation in the Fifth Phase «Process» and Sixth Phase «Analyze», to ensure that the data validation process is both efficient and effective.
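A data validation plan of the kind described above might, for example, be expressed as a set of named rules applied to incoming records, with failures logged for follow-up; the rule names and record fields here are hypothetical:

```python
# Illustrative sketch: named validation rules applied to incoming records,
# collecting (record index, rule name) pairs for every failure.
RULES = [
    ("age_in_range", lambda r: 0 <= r["age"] <= 120),
    ("income_non_negative", lambda r: r["income"] >= 0),
]

def validate(records):
    failures = []
    for i, record in enumerate(records):
        for name, check in RULES:
            if not check(record):
                failures.append((i, name))
    return failures

records = [{"age": 34, "income": 52000}, {"age": -2, "income": 41000}]
print(validate(records))  # [(1, 'age_in_range')]
```

Documenting rules in one place like this is what allows the same validation logic to be tested in the Build phase and reused unchanged in the Process and Analyze phases.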
2.6 Sixth sub-process: Design production systems and workflow
This sub-process designs the procedures needed for the statistical product by setting out an overview of all required activities and the stakeholders involved in the whole production process, and by ensuring that they fit together efficiently with no gaps or redundancies. It also defines the various systems and databases required during the process, identifies needs and assesses the suitability of existing system solutions, and determines how and when staff interact with the systems and who is responsible for what and when.
• Activities of the Process:
2.6.1 Assess the suitability of existing system solutions, including:
- Evaluate existing system solutions.
- Identify the requirements for new solutions when needed.
- Address new requirements through GASTAT governance committees’ meetings; such solutions are ideally promoted as shared services.
03 | Phase three: Build
«Build» as a phase in statistical business production: build, organize, gather and test production solutions and systems so that they are ready for use in statistical implementation processes. This includes developing and improving data collection, processing, analysis and dissemination systems, taking advantage of previous cycles of the statistical process. This phase is reviewed every time a statistical product is produced through:
• Compile statistical business procedures needed for the project in terms of shared processes and services.
• Design, create, test, manage and organize shared services as institutional shared resources across all departments entrusted with statistical business. Those configuring the workflows must use simulation or test data to ensure that processes and flows run in the right direction.
This phase consists of (7) sub-processes:
• First sub-process : Reuse or build collection instruments
• Second sub-process : Reuse or build processing and analysis components
• Third sub-process : Reuse or build dissemination components
• Fourth sub-process : Configure workflows
• Fifth sub-process : Test production systems
• Sixth sub-process : Test statistical business process
• Seventh sub-process : Finalize production systems
Sub-processes and Activities in Phase three: Build Phase
3.1 First sub-process: Reuse or build collection instruments
This sub-process describes the activities to build and reuse the collection instruments to be used during the «Collect» phase.
• Activities of the Process:
3.1.1 The collection instruments are built based on the design specifications created during the third sub-process of the «Design» phase.
3.1.2 Prepare and test the contents and functioning of the collection instrument (e.g. cognitive testing of the questions in a questionnaire).
3.1.3 It is recommended to consider the direct connection of collection instruments to a metadata system, so that metadata can be more easily captured in the collection phase. This procedure can save work in later phases. Capturing the metrics of data collection (paradata) is also an important consideration in this sub-process for calculating and analyzing process quality indicators.
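Capturing paradata allows process quality indicators to be computed directly from the collection instrument's own records. A minimal sketch, assuming hypothetical paradata fields (`status`, `duration_min`) rather than any actual GASTAT schema:

```python
from statistics import median

# Hedged sketch: two simple process quality indicators derived from paradata
# captured during collection - the response rate and the median interview
# duration among completed cases.
def collection_indicators(paradata):
    completed = [p for p in paradata if p["status"] == "completed"]
    response_rate = len(completed) / len(paradata)
    median_duration = median(p["duration_min"] for p in completed)
    return {"response_rate": response_rate,
            "median_duration_min": median_duration}

paradata = [
    {"status": "completed", "duration_min": 18},
    {"status": "completed", "duration_min": 25},
    {"status": "refused", "duration_min": 2},
    {"status": "completed", "duration_min": 21},
]
print(collection_indicators(paradata))
```

Indicators like these feed the quality management overarching process and the Evaluate phase without any extra burden on interviewers or respondents.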
3.2 Second sub-process: Reuse or build processing and analysis components
This sub-process describes the activities to reuse existing components or build new components needed for the «Process» and «Analyze» phases.
• Activities of the Process:
3.2.1 Reuse existing components or build new components needed for the «Process» and «Analyze» phases, as designed in the fifth sub-process of the «Design» phase. These may include: dashboard functions and features, transformation functions, geospatial data services, and provider and metadata management services.
3.3 Third sub-process: Reuse or build dissemination components
This sub-process describes the activities to build new components or reuse existing components needed to disseminate statistical outputs.
• Activities of the Process:
3.3.1 Reuse existing components needed to disseminate statistical outputs or build new components as designed in the «Design» phase, to provide dissemination via web services, open data outputs, geospatial statistics, maps, or individual data access.
3.4 Fourth sub-process: Configure workflows
This sub-process simulates the preparation of statistical workflow procedures to be tested in the next sub-process.
• Activities of the Process:
3.4.1 Configure and test workflow mechanisms and functions, as well as the systems used within the business processes, from data collection through to dissemination, based on the design created in the sixth sub-process of the Design phase (Design production systems and workflow).
3.4.2 Modify workflow for a specific purpose, assembling the workflows for the different phases together and configuring systems accordingly.
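The idea of assembling the workflows for the different phases into one configured pipeline can be sketched as follows; the step functions are hypothetical placeholders standing in for the collect, process and analyze services, not actual production components:

```python
# Illustrative sketch: run named workflow steps in sequence, each step
# receiving the previous step's output, as a configured statistical
# workflow would chain collection, processing and analysis.
def run_workflow(steps, data):
    for name, step in steps:
        data = step(data)
        print(f"step '{name}' done")
    return data

# Placeholder steps: collect appends a new observation, process transforms
# the data, analyze produces a summary output.
steps = [
    ("collect", lambda d: d + [42]),
    ("process", lambda d: [x * 2 for x in d]),
    ("analyze", lambda d: {"total": sum(d)}),
]
result = run_workflow(steps, [1, 2, 3])
print(result)  # {'total': 96}
```

Keeping the step list as configuration, rather than hard-coding the sequence, is what makes it possible to modify the workflow for a specific purpose without rebuilding the components themselves.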
3.5 Fifth sub-process: Test production systems
In this sub-process, statistical workflow procedures are tested to launch the project in the production environment.
• Activities of the Process:
3.5.1 Test the services assembled in the design phase and the services newly configured to fill gaps in the current service list. This includes technical testing and sign-off of new programs and routines.
3.5.2 Confirm that existing routines from other statistical business processes are suitable for use in this case.
3.5.3 Test the interactions between assembled and configured services, and ensure that the whole production solution works in a coherent way.
3.6 Sixth sub-process: Test statistical business process
• Activities of the Process:
3.6.1 Manage a field test or pilot of the statistical business process. Typically, it includes a small-scale data collection, to test the collection instruments, followed by processing and analysis of the collected data, to ensure the statistical business process performs as expected. It may be necessary to go back to a previous sub-process and make adjustments to collection instruments, systems or components of the major statistical business process.
For large-scale data collections (e.g. a comprehensive census pilot), this test may be repeated until the process performs satisfactorily.
3.7 Seventh sub-process: Finalize production systems
• Activities of the Process:
3.7.1 Put the assembled and configured processes and services, including modified and newly-created services, into production ready for use, including:
- Producing documentation about the process components, including technical documentation and user manuals.
- Training the employees on how to operate the process.
- Moving the process components into the production environment and ensuring they work as expected in that environment.
• Generally, in this «Build» phase, testing, metadata revision, or other features may be repeated to configure statistical workflow procedures before they are ready to be approved for launch.
04 | Phase four: Collect
At this phase, all required information (data, metadata, and paradata) is collected using a
variety of collection methods including outputs from sample surveys, and administrative
sources. These are compiled into databases and subsequently made available for extraction by internal units during the fifth phase, «Process» (under GASTAT’s security policies).
This phase does not include any transformations of the collected data (beyond what is needed for data loading), as these are all done in the Fifth Phase «Process».
The «Collect» Phase is broken down into four sub-processes covering a number of activities that can occur in parallel. In particular, collection, loading and integration verification activities can be iterative.
This Phase consists of (4) sub-processes:
• First sub-process: Create frame and select sample
• Second sub-process: Set up collection
• Third sub-process: Run collection
• Fourth sub-process: Finalize collection
Sub-processes and Activities in Phase four: Collect Phase
4.1 First sub-process: Create frame and select sample
This sub-process marks the beginning of the operational phase of the statistical product. It focuses on the creation of the statistical frame and the selection of the sample. It also ensures the readiness of data collectors and providers, as well as the preparedness of technical systems and processes such as web-based applications and GPS for collecting data and metadata using the methods designed during the design phase, including strategy, planning, and training activities.
If the statistical process is repeated regularly, some or all of these procedures may not need to be repeated for each iteration.
• Activities of the Process:
4.1.1 Create frame:
- Establish the collection process frame as defined in Activity 2.4.2 «Design the statistical frames plan».
- Validate and approve the use of the statistical frame once it is established.
4.1.2 Select sample:
- Select the collection process sample as defined in Activity 2.4.3 «Design the sampling plan».
- Validate and approve the use of the selected sample. This procedure is not usually relevant for processes based entirely on the use of pre-existing sources (e.g. administrative data). However, variables from administrative and other non-statistical sources can be used as auxiliary variables in the construction of the sampling design.
- Verify the readiness of the selected sample for subsequent data collection activities.
- Review and update respondents’ data (e.g., contact or housing details).
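The frame creation and sample selection activities above can be sketched in code. This is a minimal illustration only: the frame records, field names, sample size and seed are all invented for the example and are not GASTAT systems or parameters. Fixing the random seed makes the selection reproducible, which supports the validation and approval step.

```python
import random

# Hypothetical frame: 100 units with invented identifiers and contacts.
frame = [{"unit_id": i, "contact": f"unit{i}@example.sa"} for i in range(1, 101)]

def select_sample(frame, n, seed=2024):
    """Draw a simple random sample without replacement; the fixed seed
    lets the selection be reproduced during validation and approval."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

def check_readiness(sample):
    """Verify each selected unit has usable contact data (readiness check)."""
    return [u for u in sample if not u.get("contact")]

sample = select_sample(frame, 10)
missing_contacts = check_readiness(sample)
```

Because the seed is fixed, re-running the selection yields the same approved sample.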
4.2 Second sub-process: Set up collection
This sub-process ensures that the people, processes and technology (e.g. web-based applications, GPS system) are ready to collect data and metadata, in all modes as designed during the Design Phase, including the strategy, planning and training activities.
• Activities of the Process: (For survey data):
4.2.1 Preparing a collection strategy.
4.2.2 Identifying workforce needs to meet requirements during the various phases of the project.
4.2.3 Conducting nomination, training and employment processes for the workforce to ensure a sufficient number of well-trained personnel is available at the appropriate time to carry out the required work.
4.2.4 Training collection staff.
4.2.5 Developing a training system that uses supervised machine learning techniques.
4.2.6 Ensuring collection resources are available (e.g. laptops, collection apps, APIs).
4.2.7 Agreeing on terms with any intermediate collection bodies (e.g. for computer-assisted telephone interviewing, web services).
4.2.8 Configuring collection systems to request and receive the data.
4.2.9 Ensuring the security of data to be collected.
4.2.10 Preparing collection instruments (e.g. printing questionnaires, pre-filling them with existing data, loading questionnaires and data onto interviewers’ computers, APIs, web scraping tools).
4.2.11 Providing information to respondents (such as drafting messages or brochures that explain the purpose of the questionnaire and notifying respondents when data collection instruments are made available online).
4.2.12 Translating materials (e.g. into the different languages spoken or used in the country).
• Activities of the Process: (For non-survey data):
This sub-process ensures that the necessary processes, systems and confidentiality procedures are in place, to receive or extract the necessary information from the source. This includes:
4.2.13 Examining and evaluating requests for data.
4.2.14 Initiating communication with data providers and other stakeholders, and sending an introductory package containing details about the data acquisition process.
4.2.15 Checking detailed information about files and metadata with the data provider and receiving a sample of the data to assess if data are fit for use.
4.2.16 Arranging secure channels for data transmission.
4.3 Third Sub-process: Run Collection
This sub-process aims to obtain data through multiple secure channels using the relevant collection tools, to initially verify the integrity of the collected data, and to manage the relationship between GASTAT and data providers to ensure that it remains positive.
• Activities of the Process:
4.3.1 Data request, including:
- Sending notifications to respondents and communicating with them to obtain the required data.
- Sending notifications to providers of administrative or other non-statistical data to request data in accordance with the agreements outlined in sub-process "2.3" (Designing the Data Collection Process). These data requests may be made electronically, for example, via secure email.
- Sending follow-up reminder messages in case the expected data is not received.
4.3.2 Receiving data, including:
- Receiving the required data using the methods agreed upon in the data collection plan (e.g. computer-assisted personal interviews (CAPI), computer-assisted telephone interviews (CATI), computer-assisted web interviews (CAWI), or other data collection methods).
- Geo-coding may need to be done at the same time as data collection, using inputs from GPS systems, in addition to other coding based on the statistical classifications in use.
- Receiving electronic files, noting that the actual loading into the relevant temporary or permanent databases is dealt with in sub-process 4.4 below.
- Monitoring and controlling data collection and making any changes necessary to improve data quality.
4.3.3 Reviewing the integrity of collected data:
- Reviewing the integrity of datasets collected through questionnaires by verifying that acceptable response targets are met (e.g., evaluating response rate acceptability).
- Reviewing the integrity of administrative data by comparing the volume and content of the data and checking the format of received files to ensure they meet expected objectives.
- Reviewing the integrity of metadata to ensure correct uploading (e.g., avoiding negative values where positive ones are expected, and preventing errors in data formats).
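The integrity checks of activity 4.3.3 can be expressed as simple rules. The sketch below is illustrative only: the response-rate target of 70 %, the record layout, and the list of fields expected to be positive are assumptions, not GASTAT rules.

```python
# Hypothetical integrity checks for collected data (activity 4.3.3).
def response_rate(expected, received):
    """Share of expected responses actually received."""
    return received / expected if expected else 0.0

def check_formats(records, positive_fields):
    """Flag records where a field expected to be positive is negative."""
    flagged = []
    for i, rec in enumerate(records):
        for field in positive_fields:
            if rec.get(field, 0) < 0:
                flagged.append((i, field))
    return flagged

records = [{"income": 5000}, {"income": -1}, {"income": 3200}]
rate = response_rate(expected=4, received=len(records))
rate_ok = rate >= 0.70          # assumed acceptability target
bad = check_formats(records, ["income"])
```

In practice such rules would be predefined in the Design phase and applied automatically as files arrive.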
4.3.4 Monitoring data providers and collectors:
- Resolving and addressing discrepancies identified through desk review processes.
- Communicating directly with data providers to help resolve and address
discrepancies found during desk reviews, and to obtain missing data that has not yet been received.
- Responding to inquiries related to samples.
- Managing workforce performance by regularly providing feedback, recognizing and rewarding high-quality performance, effectively addressing underperformance, and so on.
- Preparing regular preliminary reports on the project workforce.
- Note: The quality of the data provided is not verified in this process, as this is done in sub-process 5.3 (Review and validate).
4.4 Fourth sub-process: Finalize collection
This sub-process is concerned with loading collected data and metadata, archiving them, and preparing them for use in the activities of the Fifth Phase «Process».
• Activities of the Process:
4.4.1 Loading data and metadata to databases:
- Loading all collected data and metadata into the relevant databases (information may be loaded before or after its integrity is confirmed, and may include reloading new versions received after earlier versions have been recorded and stored, provided the data are acceptable).
- Storing data in a temporary location before transferring and loading it to a permanent database (this applies to survey and administrative data).
- Load information associated with queries and contacts of data providers.
4.4.2 Archiving physical data collection tools:
- Archiving physical data collection tools (e.g. paper questionnaires) and disposing of them after they have been stored electronically.
4.4.3 Approving collected data, including:
- Approving data collected externally to move forward to the next processing procedures.
- Preparing and using reports containing relevant standards for data collection to review the approval process.
05 | Phase five : Process
The collected data is processed at this phase by converting it into a set of statistical outputs for two purposes:
1. Statistical analysis that reveals a broader understanding of the data.
2. Dissemination to customers using the various rules or forms approved for dissemination.
• This phase occurs in each iteration of statistical production.
• The processes and activities of this phase can be applied to data derived from statistical and non-statistical sources with the exception of the sub-process (calculate weights) which is usually specific to survey data.
• The «Process» and «Analyze» phases may also commence before the «Collect» phase is completed. This enables the compilation of provisional results and extends the time available for validation.
• This phase can be done in parallel and can be iterative with the Sixth Phase «Analyze». Analysis can reveal a broader understanding of the data, which might make it apparent that additional processing is needed.
• The difference between data validation and editing procedures at both the «Process» and
«Analyze» phases:
• At the «Process» phase: validation applies to the data entered into the models and used in producing the outputs.
• At the «Analyze» phase: validation applies to the outputs after processing.
This phase consists of (8) sub-processes:
• First sub-process : Integrate data
• Second sub-process : Classify and code
• Third sub-process : Review and validate
• Fourth sub-process : Edit and impute
• Fifth sub-process : Derive new variables and units
• Sixth sub-process : Calculate weights
• Seventh sub-process : Calculate aggregates
• Eighth sub-process : Finalize data files
Sub-processes and Activities in Phase five: Process Phase
5.1 First sub-process: Integrate data
This sub-process integrates data from one or more sources. The input data can be from a mixture of external or internal sources, and a variety of collection sources, including surveys and administrative and other non-statistical data. Administrative data or other non-statistical sources of data can substitute for some of the variables directly collected from surveys.
Data sets may need to be linked several times during this «Process» phase in order to increase the quality of the entered data.
• Activities of the Process:
5.1.1 Extraction and compilation of data, including:
- Extracting or compiling data records from the relevant databases, allowing certain phenomena to be monitored more effectively and integrated statistics, such as national accounts, to be obtained.
- Combining spatial demographic data with statistical or other non-statistical data.
- Preparing the linking process or other organizing activities.
5.1.2 Data matching, including:
- Matching or integrating multiple datasets that are required for organization or other audit activities (e.g. extracting statistical accounts and analyses, and validating inputs or outputs), using either complete matching or statistical matching:
• Complete matching: It is the process of linking records from two different sources based on a common, identical identifier (such as an ID number, facility number, or tax number).
• Statistical matching: It is the process of linking data sets without an explicit common identifier, based on similar characteristics or attributes (such as gender, age, geographic region, education).
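The two matching modes can be sketched as follows. The records, field names and distance measure are invented for illustration: complete matching joins on a shared identifier, while statistical matching pairs records by similarity of common attributes (here, a simple nearest-neighbour rule on age).

```python
# Toy records for illustration only; not real GASTAT data structures.
survey = [{"id": "A1", "age": 34, "region": "Riyadh", "income": 9000}]
register = [
    {"id": "A1", "tax_no": "T-77"},
    {"id": "B2", "tax_no": "T-88"},
]

def complete_match(left, right, key):
    """Link records that share an identical identifier (e.g. ID number)."""
    index = {r[key]: r for r in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]

def statistical_match(rec, donors, attrs):
    """Pair a record with the most similar donor on common attributes
    (a minimal nearest-neighbour rule; real systems use richer models)."""
    def distance(d):
        return sum(abs(rec[a] - d[a]) for a in attrs)
    return min(donors, key=distance)

linked = complete_match(survey, register, "id")
donors = [{"age": 33, "savings": 100}, {"age": 60, "savings": 900}]
donor = statistical_match({"age": 34}, donors, ["age"])
```

Production record linkage typically adds standardization, blocking and probabilistic scoring on top of these basic ideas.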
5.1.3 Prioritizing data, including:
- Prioritizing data when two or more sources contain data on the same variable (with potentially different values), according to rules determined in advance, such as: the quality of the source (accuracy, completeness, timeliness, consistency with other sources or related variables); trust in the source; the analytical goal (the most accurate source may be inappropriate due to a time gap or bias); or a fixed internal policy, such as adopting registry data first and then survey results.
5.1.4 Anonymizing data, including:
- Clearing all personal identifiers from linked or unlinked datasets to ensure compliance with legal requirements that protect individuals’ privacy.
- Stripping individuals’ names, addresses and other identifiers from the collected survey data and from administrative files as soon as possible (once all linking processes are completed).
• Note: Removing identifiers from datasets (usually input datasets) is different from producing confidential outputs in sub-process 6.4 (Apply disclosure control), where different methods are used and the focus is on the produced output.
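A minimal sketch of the anonymization step in activity 5.1.4: direct identifiers are removed after linking, and may be replaced by a salted pseudonym so records can still be grouped without revealing identity. The field names, salt handling and pseudonym length are assumptions for illustration; real salt management and key protection are considerably stricter.

```python
import hashlib

# Hypothetical set of direct identifiers to strip (activity 5.1.4).
DIRECT_IDENTIFIERS = {"name", "address", "national_id"}

def anonymize(record, salt="demo-salt"):
    """Drop direct identifiers and add a deterministic salted pseudonym
    so the same unit maps to the same pseudonym across files."""
    pseudonym = hashlib.sha256(
        (salt + str(record.get("national_id", ""))).encode()
    ).hexdigest()[:12]
    clean = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    clean["pseudonym"] = pseudonym
    return clean

rec = {"name": "X", "national_id": "123", "age": 40}
anon = anonymize(rec)
```

As the note above says, this input-side anonymization is distinct from the output disclosure control applied in sub-process 6.4.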
5.2 Second sub-process: Classify and code
This sub-process is concerned with the classification and coding of input data. Coding procedures may assign numerical codes to coded fields based on a predefined classification system.
• Activities of the Process:
5.2.1 Automatic classification and coding of values, including:
- Assigning all categories.
- Coding variable values using automated methods (in bulk or immediately during the review of individuals’ responses).
- Re-assigning or re-coding if necessary (e.g. coding based on geographic data, economic and professional activities).
5.2.2 Classifying and coding values not automatically classified, including:
- Identifying instances in which codes cannot be automatically assigned to variables.
- Assigning or re-assigning these instances clerically (e.g. if available information increases).
- Keeping the use of clerical coding to a minimum (e.g. assigning industrial and professional codes where descriptions cannot be automatically assigned using search tables).
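Sub-process 5.2 can be sketched as a lookup with a clerical fallback: descriptions that match the code table are coded automatically, and unmatched descriptions are queued for manual assignment. The code table and descriptions below are invented examples, not an official classification.

```python
# Hypothetical search table mapping descriptions to classification codes.
CODE_TABLE = {"wheat farming": "0111", "software publishing": "5820"}

def code_values(descriptions):
    """Assign codes automatically where possible (activity 5.2.1);
    route the rest to a clerical queue (activity 5.2.2)."""
    coded, clerical = {}, []
    for desc in descriptions:
        code = CODE_TABLE.get(desc.strip().lower())
        if code is not None:
            coded[desc] = code
        else:
            clerical.append(desc)
    return coded, clerical

coded, clerical = code_values(["Wheat farming", "artisanal bakery"])
```

The clerical queue is what activity 5.2.2 seeks to keep as small as possible.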
5.3 Third sub-process: Review and validate
This sub-process examines data to identify potential deviations, problems, errors and discrepancies found in the inputs of the «Process» phase, and adjusts and addresses them before they enter the models used to calculate weights and aggregates in sub-processes 5.6 and 5.7.
Deviations can be defined as certain errors in data, or data of a questionable nature which may arise from wrong classifications in coding, data loss, and copying errors, etc.
This sub-process aims to:
1. Ensuring that these inputs are «valid for the purpose for which they were collected» by improving them to an acceptable quality level as efficiently as possible.
2. Identifying data for automatic or manual treatment, including imputation of missing or unreliable data.
It may be run iteratively, validating data against pre-defined edit rules, usually in a set order. It may flag data for automatic or manual inspection or editing. Reviewing and validating can apply to data from any type of source, before and after integration, as well as imputed data from the next sub-process (Edit and impute). Whilst validation is treated as part of the
«Process» phase, in practice, some elements of validation may occur alongside collection activities, particularly for modes such as computer-assisted collection. Whilst this sub-process is concerned with the detection and localization of actual or potential errors, any correction activities that actually change the data are done in the next sub-process (Edit and impute).
The validation process may be applied at different times during the project (e.g. before or after the data linking, or after changing the entered data upon validating the outputs in sub- process «6.2 validate outputs»).
• Activities of the Process:
5.3.1 Detecting and addressing missing inputs and outliers, including:
- Detecting deviations in the entered data, whether those that are suitable for automatic correction or those that must be manually modified and processed, using rules or mathematical solutions, and identifying the causes of those deviations whenever possible.
- Applying methods of imputing missing or unreliable data.
- Processing the data that requires treatment in this cycle, comparing the treatment with the methods used in previous cycles for the same data.
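The detection side of sub-process 5.3 amounts to applying predefined edit rules in a set order and flagging failures for automatic or manual treatment. The rules, bounds and record layout below are assumptions for illustration; detection only, since corrections belong to the next sub-process (Edit and impute).

```python
# Hypothetical edit rules applied in a set order (sub-process 5.3).
RULES = [
    ("age_range", lambda r: 0 <= r.get("age", -1) <= 120),
    ("income_nonneg", lambda r: r.get("income", 0) >= 0),
]

def validate(records):
    """Flag every rule violation; no data is changed here."""
    flags = []
    for i, rec in enumerate(records):
        for name, rule in RULES:
            if not rule(rec):
                flags.append({"record": i, "rule": name})
    return flags

flags = validate([{"age": 34, "income": 900}, {"age": 150, "income": -5}])
```

Each flag identifies the record and the rule it failed, which supports the documentation of deviation causes in sub-process 5.4.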
5.4 Fourth sub-process: Edit and impute
• Activities of the Process:
5.4.1 Identifying appropriate methods to address outliers.
5.4.2 Applying treatment by changing or modifying the data; where a deviation is found to be justified, the appropriate treatment may be to leave the data unchanged.
5.4.3 Documenting the causes of deviations, whether they arose during the Design, Collect or processing activities.
5.4.4 Addressing outliers, and flagging them as changed.
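A minimal sketch of sub-process 5.4: missing values are imputed with the mean of the reported values, and every changed value is flagged, as activity 5.4.4 requires. Mean imputation and the variable name are assumptions for illustration; the actual methods are those chosen in the Design phase.

```python
# Illustrative mean imputation with flagging (sub-process 5.4).
def mean_impute(records, field):
    """Fill missing values with the mean of reported values and flag
    each imputed record so edits remain traceable (activity 5.4.4)."""
    reported = [r[field] for r in records if r.get(field) is not None]
    mean = sum(reported) / len(reported)
    for r in records:
        if r.get(field) is None:
            r[field] = mean
            r[f"{field}_imputed"] = True
    return records

data = [{"income": 100}, {"income": None}, {"income": 300}]
data = mean_impute(data, "income")
```

Flagging lets later phases distinguish reported from imputed values when assessing quality.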
5.5 Fifth sub-process: Derive new variables and units
This sub-process derives data for new variables and units that are necessary to deliver the required outputs, by applying arithmetic formulae to one or more of the variables that are already present in the dataset, or by applying different model assumptions.
This activity may need to be iterative during the Fifth Phase «Process» prior to certain activities such as data linking, coding, validation and obtaining outputs, as some derived variables may themselves be based on other derived variables. It is therefore important to ensure that variables are derived in the correct order. New units may be derived by aggregating or splitting data for collection units, or by various other estimation methods.
• Activities of the Process:
5.5.1 Extracting new classes and units, including:
- Deriving new statistical communities from aggregated data (e.g. individuals under the age of 15 with disabilities).
- Extracting values for new statistical units by aggregating or dividing data into collection or reporting units, or through various other aggregation methods (for example, deriving households when the collection units are persons, or converting 7 barrels of oil into 1 ton).
5.5.2 Deriving new variables, including:
- Deriving values for new variables necessary to aggregate the required outputs, or to validate them by applying arithmetic formulae to one or more of the variables that are already present in the dataset.
- Applying extraction processes to datasets either before or after data linkage, as some derived variables may themselves be based on other derived variables.
- Ensure that variables are extracted in the correct order, as follows:
1. Ensure the accuracy and quality of the original variables.
2. Address outliers or missing values, if any.
3. Derive new variables.
4. Check consistency between variables (examples of deriving new variables are: deriving the age variable from the variables date of birth and date of interview, deriving monthly income from weekly or annual income, and deriving the number of individuals under 10 years of age from a list of individuals with their ages).
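The derivation examples given in the text (age from date of birth and interview date, and a count of persons under a given age) can be sketched directly. The dates are invented for the example.

```python
from datetime import date

# Derive age from date of birth and interview date (sub-process 5.5).
def derive_age(birth, interview):
    years = interview.year - birth.year
    if (interview.month, interview.day) < (birth.month, birth.day):
        years -= 1          # birthday not yet reached in interview year
    return years

members = [date(2018, 6, 1), date(2001, 3, 15), date(2020, 12, 9)]
interview = date(2025, 5, 1)
ages = [derive_age(b, interview) for b in members]
under_10 = sum(1 for a in ages if a < 10)   # derived count variable
```

Note the ordering principle stated above: the age variable must be derived and checked before the under-10 count that depends on it.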
5.6 Sixth sub-process: Calculate weights
This sub-process creates weights for sampling unit data according to the methodology developed in sub-process 2.5 (Design processing and analysis). It also converts certain statistical inputs into different outputs using specific and tested modeling methods when designing the methodology under activity «2.6.3 design data validation plan».
• Activities of the Process:
5.6.1 Creating initial weights for sample unit records according to the methodology developed in sub-process 2.5 (Design processing and analysis). They are based on the probability of selecting each unit in the sample. (For example, if the probability of selecting a person from the population is 1/1000, then their initial weight will be 1000.)
5.6.2 Adjusting weights for non-response (if applicable) to compensate for units (categories) that did not respond. For example, if a particular category of the population recorded a low response rate, the weight is adjusted by increasing the weight of individuals who responded from that category.
5.6.3 Calibrating the weights (if required) by adjusting them so that the overall characteristics of the sample match known characteristics of the population (from the population census, for example, for household surveys) or (from reliable record data, such as data on the number of employees or registered capital in economic or industrial records, for example, for establishment surveys).
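The three weighting steps above can be worked through with invented figures: the initial weight is the inverse of the selection probability, a non-response adjustment inflates the weights of respondents, and a simple ratio calibration scales the weights to a known population total. The figures and the ratio method are illustrative assumptions, not the actual GASTAT methodology.

```python
# Worked sketch of sub-process 5.6 with invented figures.
def initial_weight(p_selection):
    return 1 / p_selection                    # activity 5.6.1

def nonresponse_adjust(weight, sampled, responded):
    return weight * sampled / responded       # activity 5.6.2

def calibrate(weights, population_total):
    factor = population_total / sum(weights)  # activity 5.6.3 (ratio method)
    return [w * factor for w in weights]

w0 = initial_weight(1 / 1000)                 # matches the 1/1000 example
w1 = nonresponse_adjust(w0, sampled=50, responded=40)
calibrated = calibrate([w1] * 40, population_total=52000)
```

After calibration the weighted sample reproduces the known population total exactly, which is the defining property of the step.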
5.7 Seventh sub-process: Calculate aggregates
Examples of outputs from this sub-process include the creation of aggregated data as population totals from individual data or lower-level aggregates, and the calculation of corresponding sampling errors, administrative data aggregates, accounts, size scales, indicators, time series, forecasts or predictions, error measurements, averages, dispersion and quality standards.
• Activities of the Process:
5.7.1 Calculating the totals for the main variables or auxiliary variables, taking into account the calculation of weights in sub-process 5.6.
5.7.2 Deriving aggregates (other estimates) such as rates, ratios, and means.
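Aggregation in sub-process 5.7 combines unit records with the weights from sub-process 5.6. The sketch below shows a weighted total and a weighted mean; all figures and field names are illustrative.

```python
# Illustrative weighted aggregation (sub-process 5.7).
units = [
    {"income": 4000, "weight": 1000},
    {"income": 6000, "weight": 1200},
]

def weighted_total(units, var):
    """Population total estimated as the weight-expanded sum."""
    return sum(u[var] * u["weight"] for u in units)

def weighted_mean(units, var):
    """Weighted mean: expanded total divided by the sum of weights."""
    return weighted_total(units, var) / sum(u["weight"] for u in units)

total = weighted_total(units, "income")
mean = weighted_mean(units, "income")
```

Rates and ratios (activity 5.7.2) follow the same pattern, dividing one weighted total by another.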
5.8 Eighth sub-process: Finalize data files
• Activities of the Process:
5.8.1 Loading statistical outputs (data and metadata) into the relevant output databases, to serve as inputs for the next phase, «Analyze». Sometimes the outputs may not be final, especially when there are strong time pressures and requirements for producing both initial and final estimates.
06 | Phase six: Analyze
This phase is concerned with the detailed examination and validation of statistical outputs against expectations; relevant documents are prepared and content is developed for dissemination. This phase includes activities that enable statistical analysts to understand and interpret the resulting outputs.
This phase is carried out in each cycle for statistical outputs produced regularly or as a one-time release. Its outputs include: statistics, relevant metadata and comments for both internal and external release.
The outputs of this phase can also be used as input for other sub-processes (e.g., using new source analysis as input for the «Design» phase).
The «Analyze» phase includes all statistical outputs, regardless of the source of the data.
This Phase consists of (5) sub-processes:
• First sub-process : Prepare draft outputs
• Second sub-process : Validate outputs
• Third sub-process : Interpret and explain outputs
• Fourth sub-process : Apply disclosure control
• Fifth sub-process : Finalize outputs
Sub-processes and Activities in Phase six: Analyze Phase
6.1 First sub-process: Prepare draft outputs
• Activities of the Process:
6.1.1 Collecting factual information about the statistical scope (statistical field) by making the most of the common database available at GASTAT.
6.1.2 Preparing draft outputs from the «Calculate aggregates» and «Finalize data files» processes in the seventh and eighth sub-processes of the «Process» phase.
6.2 Second sub-process: Validate outputs
This sub-process validates the quality of the produced outputs in accordance with a general quality framework and customers’ expectations. The team entrusted with this sub-process must review, analyze and interpret the statistical outputs in depth from many perspectives, using different validation tools and methods. These include comparing statistics with previous cycles (as far as possible), comparing statistics with other relevant data (internal and external), examining statistics or analyzing their component parts, deriving patterns or key trends from the data, and reviewing quality standards related to the data.
• Activities of the Process:
6.2.1 Detecting and addressing outlier outputs that can be automatically corrected, including:
- Predetermining the correction rules for detecting outlier outputs.
- Detecting anomalies in the data by applying validation rules.
- Checking that the coverage and response rates are as required.
- Comparing the statistics with previous cycles to validate data.
- Checking that the associated metadata, paradata and quality indicators are present and in line with expectations.
- Checking geospatial consistency of the data.
- Comparing the statistics against other relevant data (both internal and external).
- Investigating inconsistencies in the statistics.
- Validating the statistics against expectations and domain intelligence.
- This procedure is usually repeated each time a new version of the outputs is produced.
6.2.2 Analyzing outputs and adjusting deviations, including:
- Statistical analysis of outputs using statistical analysis methods, presentation methods (tables, graphs, maps) and tools ideally using a group of internal and external data sources.
- Addressing outputs that are different from expectations to ensure that they are correct and valid for release. (If there are justifiable and explainable deviations from expectations, the outputs will remain unchanged.)
- Applying a group of methods to address any deviations.
- Documenting all treatment decisions for future reference (for consistency and quality purposes).
- Addressing data deviation from the source (e.g. price quality adjustments, derivation processing, etc.).
6.3 Third sub-process: Interpret and explain outputs
This sub-process aims to interpret data outputs, where comments and relevant supporting documents explaining these outputs are compiled. Documents and reports related to statistical analyses can be used even if they have been prepared for the internal approval processes of GASTAT.
• Activities of the Process:
6.3.1 Preparing clarifications and documents for release:
- Preparing and explaining statistical comments and interpretations to support the data being released. Such comments may be in the form of text, presentations, tables, graphs, or maps.
- Preparing official documents using GASTAT-approved forms, to be assembled into products in sub-process 7.2 «Produce dissemination products».
6.4 Fourth sub-process: Apply disclosure control
This sub-process ensures that the data and metadata to be disseminated do not breach the
confidentiality rules according to GASTAT dissemination policies, or to the process-specific
methodology created in sub-process 2.5 (Design processing and analysis).
This may include checks for primary and secondary disclosure and dissemination, as well as the application of data anonymization or obfuscation techniques and output inspection.
• Activities of the Process:
6.4.1 Monitoring and detecting disclosure of confidential data:
- Using statistical analysis methods and means of presentation to detect potential disclosure of confidential data (similar to those used in activity «6.2.2 Analyzing outputs and adjusting deviations»).
6.4.2 Handling confidential data:
- Apply automated or manual control methods to anonymize the data or to obfuscate it prior to disclosure and publication.
- Examining the outputs after applying this process.
• Control methods, such as slight modification or deletion of data, may limit the amount of detail available, and disseminating such detail may require permission from the data provider.
• The degree and method of statistical disclosure control may vary for different types of outputs. For example, the approach used for microdata sets for research purposes will be different to that for published tables, finalized outputs of geospatial statistics or visualizations on maps.
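One elementary disclosure-control technique for published tables is primary cell suppression: cells in a frequency table whose count falls below a threshold are withheld before release. The threshold of 3, the table and the cell labels below are assumptions for illustration, not GASTAT rules; real disclosure control also handles secondary suppression so suppressed cells cannot be recovered from the margins.

```python
# Minimal primary suppression sketch (sub-process 6.4).
def suppress_small_cells(table, threshold=3):
    """Replace counts below the threshold with None (= suppressed)."""
    return {
        cell: (count if count >= threshold else None)
        for cell, count in table.items()
    }

table = {"Riyadh/female": 2, "Riyadh/male": 15, "Jeddah/female": 7}
safe = suppress_small_cells(table)
```

This corresponds to the output-side checks of activity 6.4.2, as opposed to the input-side anonymization of sub-process 5.1.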
6.5 Fifth sub-process: Finalize outputs
This sub-process aims to approve data, metadata, and comments that interpret the data and load them to dissemination databases after they reach the required quality level.
• Activities of the Process:
6.5.1 Convening internal meetings to discuss approval:
- Holding meetings between the statisticians responsible for producing the outputs and the relevant specialized experts at GASTAT to discuss the quality of the data outputs and verify their consistency.
6.5.2 Approving content for release, including:
- Translating statistical content prepared for release into English including metadata and comments (text and presentations) and other supporting technical documents.
- Approving statistical content prepared for release (in both Arabic and English) including metadata and comments (text and presentations) and other supporting technical documents.
6.5.3 Checking data outputs with customers:
- Reviewing data outputs on customers’ requests by taking their views on the data outputs prior to dissemination.
• Note: This activity is specific to the surveys carried out by GASTAT at the request of beneficiary parties.
6.5.4 Approval of the statistical product outputs by the competent authority.
07 | Phase seven: Disseminate
This phase focuses on compiling, organizing, and releasing statistical products and content
delivered to clients through the dissemination service. Products and services are generated
from approved statistical content that is ready for release and has been uploaded to secure
dissemination databases.
This phase is carried out for each cycle of statistical outputs, whether produced regularly or as
a one-time release, and its outputs include:
Any complete statistical product published through GASTAT’s approved dissemination channels.
This Phase consists of (5) sub-processes:
• First sub-process: Update output systems
• Second sub-process: Produce dissemination products
• Third sub-process: Manage release of dissemination products
• Fourth sub-process: Promote dissemination products
• Fifth sub-process: Manage user support
Sub-processes and Activities in Phase seven: Disseminate Phase
7.1 First sub-process: Update output systems
This sub-process manages the update of systems (e.g. databases) where data and metadata are stored ready for dissemination purposes.
• Activities of the Process:
7.1.1 Preparing and loading of outputs to public dissemination databases, including:
- Formatting and preparing data and metadata ready to be put into public dissemination databases.
- Loading data and metadata into public dissemination databases.
- Ensuring data are linked to the relevant metadata.
Note: Formatting, loading and linking of metadata should preferably mostly take place in
earlier phases, but this sub-process includes a final check that all of the necessary
metadata are in place ready for dissemination.
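The final check mentioned in the note could be sketched as follows. The shared-identifier scheme and the `find_unlinked_datasets` helper are assumptions for illustration, not an actual GASTAT system.

```python
# Minimal sketch of a final pre-dissemination check: every dataset loaded
# into the public dissemination database must be linked to a metadata
# record. Identifiers and record structures are hypothetical.

def find_unlinked_datasets(datasets, metadata):
    """Return IDs of datasets with no matching metadata record."""
    return sorted(ds_id for ds_id in datasets if ds_id not in metadata)

datasets = {"LFS-2023-Q4": {"rows": 12000}, "CPI-2024-01": {"rows": 480}}
metadata = {"LFS-2023-Q4": {"title": "Labour Force Survey", "period": "2023-Q4"}}

missing = find_unlinked_datasets(datasets, metadata)
# missing == ['CPI-2024-01'] -> must be resolved before dissemination
```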
7.2 Second sub-process: Produce dissemination products
This sub-process produces the dissemination products, as previously designed in sub-process
2.1 (Design outputs), to meet customers’ needs. Promotional materials are also prepared, and may include printed publications, press releases and websites. The products can take many forms including interactive graphics, tables, maps and public-use microdata sets.
• Activities of the Process:
7.2.1 Assembling regular products:
- Assembling products made available to beneficiaries and regularly released (e.g. data tables, time series, publications, special analyses, graphs, data presentations, etc.) of approved data, associated text, presentations, and technical documents derived from secure dissemination databases.
7.2.2 Assembling customized products:
- Assembling products specifically designed to meet the needs of specific customers (e.g. special tables and analyses, confidential files for unit records, etc.).
• Note: These products may be directed to customers for a fee except for international requests.
7.2.3 Preparing promotional materials:
- Preparing and carrying out promotional plans for products with a view to advertising.
- Considering the appropriate methods for each product according to its importance.
• Note: Promotional plans include study of target audience and access strategy.
7.2.4 Approving products in their final form for release and dissemination.
7.3 Third sub-process: Manage release of dissemination products
This sub-process ensures that all elements for the release are in place including managing the timing of the release. It includes briefings for specific groups such as the press or decision makers from relevant officials, facilitating access to statistical content in public dissemination databases through multiple dissemination channels for different beneficiaries on various devices, as well as the arrangements for any pre-release embargoes. It also includes managing access to confidential data by authorized user groups according to GASTAT access policies. The disseminated content is also archived in accordance with the archiving policy designed in Cross-cutting Process «9.3 data management».
• Activities of the Process:
7.3.1 Exporting data to public dissemination databases, including:
- Copying data from a protected internal storage server to a platform available to beneficiaries.
- Making data available for download, and accessible through a set of interactive data-visualization tools on the website.
7.3.2 Notifying relevant stakeholders prior to release, including:
- Coordinating outputs with stakeholders so that data restrictions can be lifted.
- Coordinating with partner entities to produce a joint release of some products in some cases.
7.3.3 Releasing products and services to customers, including:
- Approving statistical content for public dissemination.
- Releasing statistical content through the specified dissemination channels.
- Observing the statistical product dissemination date as previously posted on GASTAT website.
• Note: Release formats include reports and publications, presentations, and participation in statistical product-related events.
7.3.4 Verifying that content has been successfully archived and that the disseminated content is preserved.
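Archive verification of this kind is often implemented by comparing checksums recorded at release time against the archived copies. The following is a minimal sketch under that assumption; the function names and the checksum registry are illustrative, not part of GASTAT's systems.

```python
# Sketch of an archive-verification step, assuming each release is
# verified by comparing a SHA-256 checksum recorded at release time
# against the archived copy. Paths and registries are hypothetical.

import hashlib

def sha256_of(path):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_archive(path, expected_checksum):
    """True if the archived file still matches the checksum recorded at release."""
    return sha256_of(path) == expected_checksum
```

Running such a check periodically (not only at archiving time) also supports the data-integrity activities described under Data Management.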
7.4 Fourth sub-process: Promote dissemination products
This sub-process concerns the active promotion of the statistical products that are released and disseminated to reach various beneficiaries.
• Activities of the Process:
7.4.1 Using customer relationship management tools to better target potential users of
the products.
7.4.2 Using tools including websites, wikis and blogs to facilitate the process of communicating
statistical information to users.
7.5 Fifth sub-process: Manage user support
This sub-process includes collection and review of customers’ queries, service requests and requirements to be answered and met.
• Activities of the Process:
7.5.1 Receiving customers’ queries and questions, including:
- Receiving queries and questions from customers and beneficiaries of data and
information related to the products and services of GASTAT.
- Receiving requests for availability and access to accurate data.
- These queries and requests should be regularly reviewed to provide an input to the
overarching quality management process, as they can indicate new or changing
user needs.
- Monitoring interaction, measuring customers’ feedback, and submitting it to the relevant department to be used for improvements.
7.5.2 Responding to customers’ queries, including:
- Responding to customers’ queries shortly after the release (e.g., questions regarding
released statistics, availability of additional data, etc.).
- Recording queries to support the «Evaluate» phase.
- Handling subsequent queries, including requests for customized outputs, products
or services, through Customer Relationship & Support Department.
08 | Phase eight: Evaluate
This phase manages the evaluation of a specific instance of a statistical business process, as opposed to the more general overarching process of statistical quality management described in (Overarching Processes). It relies on inputs gathered throughout the production phases to be evaluated based on specific procedures leading to the analysis and identification of improvements required for product quality assurance.
It can take place at the end of the instance of the process, but can also be done on an ongoing basis during the statistical production process.
For statistical products produced regularly, this phase occurs in each iteration, and its outputs include the evaluation and a plan of improvements.
For such regularly produced outputs, evaluation should, at least in theory, serve as the decision point for post-evaluation procedures:
• Determining whether future iterations should take place.
• If so, whether any improvements should be implemented.
• Deciding whether the next iteration should start from the “Specify Needs” phase or from some later phase (often the “Collect” phase).
This Phase consists of (3) sub-processes:
• First sub-process: Gather evaluation inputs
• Second sub-process: Conduct evaluation
• Third sub-process: Agree an action plan
Sub-processes and Activities in Phase eight: Evaluate Phase
8.1 First sub-process: Gather evaluation inputs
This sub-process aims to compile all forms of evaluation inputs, including feedback from users, process metadata, system metrics, staff suggestions, and reports of progress against an action plan agreed during a previous iteration, and to make them available to the team that will carry out the evaluation. For the evaluation of certain processes, it can be necessary to perform specific activities such as small surveys (e.g. post-enumeration surveys, re-interview studies, surveys on the effectiveness of dissemination).
• Activities of the Process:
8.1.1 Gather quantitative evaluation inputs:
- Gathering a set of quantitative, measurable inputs to support the evaluation process.
- Such inputs include metrics or measures relating to output quality, such as the effects of imputing missing values and revisions to relative standard errors.
8.1.2 Gather qualitative evaluation inputs:
- Collecting qualitative or textual information using a variety of methods (e.g. client meetings, internal meetings to discuss outstanding issues, workshops, etc.).
- This information is collected from:
A. Production staff.
B. Customers.
C. Data providers.
D. Service providers.
• Note: The information is compiled manually, and some information collected during the project, such as comments on output quality during review, is added to it.
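The quantitative inputs mentioned in 8.1.1, such as imputation effects and relative standard errors, can be computed from production data. The sketch below shows two such metrics under simplified assumptions: a `None` value marks an imputed record, and the RSE formula takes the standard error as given rather than deriving it from the survey's sample design.

```python
# Illustrative computation of two quality metrics used as quantitative
# evaluation inputs. Simplifying assumptions: None marks an imputed
# value, and the standard error is supplied, not estimated here.

def imputation_rate(values):
    """Share of records whose value was imputed (None = imputed here)."""
    imputed = sum(1 for v in values if v is None)
    return imputed / len(values)

def relative_standard_error(estimate, standard_error):
    """RSE expressed as a percentage of the estimate."""
    return 100.0 * standard_error / estimate

print(imputation_rate([4.2, None, 3.9, None, 5.1]))                      # 0.4
print(relative_standard_error(estimate=250_000, standard_error=5_000))   # 2.0
```

Tracking such metrics across iterations makes revisions visible, which is exactly what this sub-process needs as input.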
8.2 Second sub-process: Conduct evaluation
In this sub-process, evaluation inputs are analyzed and consolidated into a report or dashboard. The evaluation report includes an action plan, any quality issues, and recommendations for changes where appropriate. These recommendations can cover changes to any phase or sub-process for future iterations of the process, or can suggest that the process not be repeated. The action plan is then presented to the relevant body to set priorities and move forward.
The evaluation can take place at the end of the whole process (ex-post evaluation) for selected activities, during its execution in a continuous way, or throughout the process, thus allowing for quick fixes or continuous improvement. The resulting report should note any quality issues specific to this iteration of the statistical business process.
• Activities of the Process:
8.2.1 Analyze evaluation inputs, including:
- Accurately analyzing the evaluation inputs collected in sub-process «8.1 Gather evaluation inputs».
- Comparing the results of the analysis with the expected results of the project, to determine whether or not they have been achieved.
8.2.2 Identify improvements, including:
- Identifying a set of improvements or potential solutions to the problems identified by the analyses.
- Discussing identified improvements with those concerned to bring them before the authorized person.
• Note: Improvements can be made to processes, methodologies, systems, personnel skills, standards and statistical frameworks.
8.2.3 Provide data quality instructions to data providers, including:
- Providing data providers with information and resources with a view to improving data quality.
8.2.4 Measure customer satisfaction, including:
- Developing specific mechanisms to measure customer satisfaction and identify elements of satisfaction.
- Coordinating with customers to express their views about the product through direct contact or workshops.
- Measuring customer satisfaction with the statistical product.
- Dealing with dissatisfied customers and explaining the situation to them.
- Analyzing the results and submitting them to the relevant department for improvement.
- Measuring the satisfaction of the statistical community.
• Note: Customer satisfaction questionnaire can be carried out on several products or a specific product.
8.2.5 Measure customer use of indicators:
- Measuring customer use of published statistical indicators and the extent to which these indicators meet customers’ objectives and requirements.
8.2.6 Prepare evaluation report, including:
- Documenting the results of the evaluation analyses.
- Preparing the evaluation report based on the results of documented analyses.
- Making recommendations for improvements.
• Note: The report includes: Recommended changes to various strategies and the data set collected during the survey, impact assessments, proposed changes to the issuance of lists and risk mitigation strategies, and an action plan indicating the appropriate timing, priority and responsibility to undertake all identified improvements.
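As a simplified illustration of the satisfaction measurement in 8.2.4, responses on a rating scale can be summarized into a mean score and a share of satisfied customers. The 5-point scale and the convention that a rating of 4 or 5 counts as "satisfied" are assumptions, not GASTAT's official methodology.

```python
# Minimal sketch of summarizing customer satisfaction responses.
# Assumptions: a 1-5 rating scale, and "satisfied" means a rating >= 4.

def satisfaction_summary(responses):
    """responses: ratings on a 1-5 scale. Returns mean rating and % satisfied."""
    mean = sum(responses) / len(responses)
    satisfied = sum(1 for r in responses if r >= 4) / len(responses)
    return {"mean_rating": round(mean, 2), "pct_satisfied": round(100 * satisfied, 1)}

print(satisfaction_summary([5, 4, 3, 5, 2, 4, 4]))
# {'mean_rating': 3.86, 'pct_satisfied': 71.4}
```

Comparing these summaries across iterations or products gives the relevant department the trend data it needs for improvements.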
8.3 Third sub-process: Agree an action plan
• Activities of the Process:
8.3.1 Based on the evaluation and improvement recommendations, a proposed action plan is presented to the authorized person for discussion, prioritization and approval.
• It is important to consider monitoring the impact of planned changes and improvements on the statistical process and its outcomes, as this may provide input for evaluating future iterations of the statistical process.
09 | Quality Management, Metadata Management & Data Management
In addition to the overarching processes applied during the eight production phases discussed above, this section will address other processes related to quality management, metadata management and data management.
In general, these processes aim to increase the likelihood that work is completed on time, within the defined budget, and at the required quality levels; to mitigate the risks that the product may face at any of the previous eight phases of production; and to strengthen the aspects of project management as a whole.
Overarching processes include:
9.1 Quality Management
• The main goal of quality management within the statistical business process is to understand and manage the quality of the statistical sources, processes and products. There is general agreement among statistical organizations that quality should be defined according to the ISO 9000-2015 standard: «The degree to which a set of inherent characteristics of an object fulfils requirements». Thus, quality is a complex and multi-faceted concept, usually defined in terms of several quality dimensions. The dimensions of quality that are considered most important depend on user perspectives, needs and priorities, which vary between processes and across groups of users.
• While quality management is linked to the «Evaluate» phase, it should be present in all phases of the statistical business process. Procedures should be evaluated at each phase, bearing in mind that evaluation is unlikely to be feasible for every iteration of every part of every statistical business process. Thus, there should be a systematic approach to evaluation, following a pre-determined schedule that allows all main parts of the process to be reviewed within a specified time period.
• These evaluations generate feedback, which includes the metadata produced by the various sub-processes and is used as an input to process quality management. Evaluations can be applied within a specific process, or across several processes that use common components.
• Quality control procedures also play a key role in quality management, at the level that should be implemented within sub-processes to prevent and monitor errors and sources of risk. These procedures must be documented, so that they can be used for quality reporting.
• Quality management within GASTAT should be based on the specific quality framework adopted by senior management. Such a framework may not be fully compatible with the quality frameworks of other statistical organizations, which makes cross-organizational benchmarking all the more important.
• «Quality loop» reinforces the approach to continuous improvements and organizational learning as the quality processes go through the elements of such loop as elaborated in the below figure:
Figure 3: Quality loop
• Examples of quality management activities include:
- Setting quality criteria at each phase of the project to be used in the process.
- Setting process quality targets and monitoring compliance.
- Examining process metadata and quality indicators.
- Identifying and assessing risks, including financial risk, employee benefit risk, customer relationship risk, etc.
- Implementing risk treatments to ensure fit-for-purpose quality.
- Monitoring project action plan and budget.
- Seeking and analyzing user feedback.
- Reviewing processes and documenting lessons learned.
- Internal or external auditing on the process.
- Preparing workflow reports.
- Adopting corrective actions.
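Two of the activities listed above, setting process quality targets and monitoring compliance, can be sketched as a simple indicator-versus-target check. The indicator names and target values below are illustrative assumptions, not GASTAT's actual quality criteria.

```python
# Sketch of monitoring process quality indicators against targets.
# Indicator names and target values are hypothetical examples.

TARGETS = {
    "response_rate_pct": ("min", 80.0),    # must be at least this
    "edit_failure_rate_pct": ("max", 5.0), # must be at most this
}

def check_compliance(indicators):
    """Return the indicators that breach their targets, with details."""
    breaches = {}
    for name, (direction, target) in TARGETS.items():
        value = indicators[name]
        if (direction == "min" and value < target) or \
           (direction == "max" and value > target):
            breaches[name] = {"value": value, "target": target}
    return breaches

print(check_compliance({"response_rate_pct": 76.5, "edit_failure_rate_pct": 3.1}))
# {'response_rate_pct': {'value': 76.5, 'target': 80.0}}
```

Documented checks of this kind also supply the evidence needed for quality reporting, as noted above.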
9.2 Metadata Management
• Metadata has an important role and must be managed at an operational level within the statistical production process. When aspects of metadata management are considered at corporate or strategic level (e.g. there are metadata systems that impact large parts of the production system), it should be considered in the framework of the Generic Activity Model for Statistical Organizations (GAMSO).
• Because metadata exists at every phase of the statistical process, the overall management process of metadata focuses on the creation/revision, updating, use and archiving of statistical metadata.
• These metadata should be captured as early as possible, and stored and transferred from phase to phase alongside the data they refer to. A metadata management strategy and supporting systems are therefore vital to the operation of this model, and these can be facilitated by the Generic Statistical Information Model (GSIM).
• GSIM is a reference framework of information objects, which enables generic descriptions of the definition, management and use of data and metadata throughout the statistical production process. GSIM supports a consistent approach to metadata, facilitating the primary role for metadata, that is, that metadata should uniquely and formally define the content and links between information objects and processes in the statistical information system.
• The METIS Common Metadata Framework identifies sixteen core principles for metadata management, all of which are intended to be covered by the overarching metadata management process and taken into consideration when designing and implementing a statistical metadata system. A sample of these principles, by group, includes:
| Metadata handling | 1. Statistical Business Process Model: Manage metadata with a focus on the overall statistical business process model. |
| Metadata Authority | 1. Registration: Ensure the registration process (workflow) associated with each metadata element is well documented so there is clear identification of ownership, approval status, date of operation, etc. |
| Relationship to Statistical Cycle / Processes | 1. Integrity: Make metadata-related work an integral part of business processes across the organization. |
| Users | 1. Identify users: Ensure that users are clearly identified for all metadata processes, and that all metadata capturing will create value for them. |
9.3 Data Management
• Data management is essential as data are produced within many of the activities in the statistical business process and are the key outputs. The main goal of data management is to ensure that data are appropriately used and usable throughout their lifecycle. Managing data throughout their lifecycle covers activities such as planning and evaluation of data management processes as well as establishing and implementing processes related to collection, organization, use, protection, preservation and disposal of the data.
• How data are managed will be closely linked to the use of the data, which in turn is linked to the statistical business process where the data are created. Both data and the processes in which they are created must be well defined in order to ensure proper data management.
• Examples of data management activities include:
- Establishing a governance structure and assigning data stewardship responsibilities.
- Designing data structures and associated data sets, and the flow of data through the statistical business process.
- Identifying databases (repositories) to store the data, and administering those databases.
- Documenting the data (e.g. registering and inventorying data, classifying data according to content, retention or other required classification).
- Determining data retention periods, identifying the risks that data archiving may face, and finding solutions to overcome them, such as:
• Limited institutional rules for archiving and disposing of data and metadata (for example, deciding which releases need to be retained).
• Difficulties in archiving administrative data collected through the project.
- Securing data against unauthorized access and use.
- Safeguarding data against technological change, physical media degradation, and data corruption.
- Performing data integrity checks (e.g. periodic checks providing assurance about the accuracy and consistency of data over its entire lifecycle).
- Performing disposition activities once the retention period of the data has expired.
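The disposition activity above presupposes a check that flags datasets whose retention period has elapsed; such a check could be sketched as follows. The record structure and retention rules are hypothetical examples, not GASTAT's archiving policy.

```python
# Sketch of a retention check supporting data disposition: flag archived
# datasets whose retention period has elapsed. Records and retention
# rules are hypothetical; retention is approximated as 365-day years.

from datetime import date, timedelta

def expired(record, today):
    """True if the dataset's retention period has elapsed as of `today`."""
    return record["archived_on"] + timedelta(days=365 * record["retain_years"]) < today

records = [
    {"id": "LFS-2015-Q1", "archived_on": date(2015, 6, 1), "retain_years": 7},
    {"id": "CPI-2023-05", "archived_on": date(2023, 6, 1), "retain_years": 10},
]
due_for_disposal = [r["id"] for r in records if expired(r, date(2024, 1, 1))]
# ['LFS-2015-Q1']
```

In practice, disposal of flagged datasets would still require approval under the archiving policy referenced in sub-process 9.3.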
10 | Tenth: List of Acronyms
| GSBPM | Generic Statistical Business Process Model: A flexible tool to describe and define the set of business processes needed to produce official statistics. |
| GAMSO | Generic Activity Model for Statistical Organizations: A reference framework describing and defining the activities that take place within a typical statistical organization. |
| GSIM | Generic Statistical Information Model: A reference framework of information objects, which enables generic descriptions of the definition, management and use of data and metadata throughout the statistical production process. |
| GPS | Global Positioning System |
| HLG-MOS | High-Level Group for the Modernization of Official Statistics. |
| ICT | Information and Communication Technology |
| UNECE | United Nations Economic Commission for Europe. |
| CAPI | Computer-assisted personal interviews |
| CATI | Computer-assisted telephone interviews. |
| CAWI | Computer-assisted Web Interviews. |