
Historians: Net-Producer

This article provides starter solutions related to obtaining data from a historian, adding context, and writing a payload to a data lake.

What Does This Article Cover?

HighByte Intelligence Hub can be used to create a solution that obtains process values from historian software, applies context, and writes a contextualized payload to a data lake. There are many variations of this type of solution; the two covered in this article are obtaining order context from an external source and defining asset hierarchy context using Intelligence Hub Namespaces.


 

Solution Assumptions - Order Context  

The following summarizes the assumptions related to the example solution.

  • The solution scope included thousands of PI Points.
  • The frequency of value changes varied per PI Point, with thousands of value changes per minute for the in-scope PI Points.
  • The value change data was obtained in the context of AVEVA PI Asset Framework Assets.
  • The solution required each unique PI Asset identifier to be linked to a current manufacturing order. These associations were maintained in a Microsoft SQL Server database table.
  • AVEVA PI System was installed on an Amazon EC2 instance.  The Intelligence Hub PI Connection agent was installed on the same EC2 instance.
  • Intelligence Hub was installed on a second Amazon EC2 instance with 8 GB of RAM. The heap memory allocated to the Java Virtual Machine (JVM) was not adjusted.
  • The Intelligence Hub Pipeline wrote data to a .CSV file that was stored in Amazon S3.  Each .CSV file contained 50,000 records.

 

Solution Summary - Order Context 

The following summarizes the design of the example solution.

  • The in-scope PI Asset Framework Assets were determined based on Asset Template.  All Assets derived from a specific Asset Template were considered in-scope for the starter solution.
  • The PI Connection Asset Changes Type Input was used to subscribe to all value changes for all PI Attributes associated with the Assets instantiated from the Asset Template. The Connection Input provides metadata, including each Asset's PI element identifier. When queried every 30 seconds, the Connection Input returned the data in less than one second.
  • An Intelligence Hub Model provides an opportunity to structure the payload written to the destination system. The schema included unique identifiers for PI metadata as well as the context, in this case the manufacturing order. Consider how to handle values; for example, all values could be converted to strings, or there could be one column for numeric values and a second column for non-numeric values (see the schema sketch after this list).
  • The Intelligence Hub Pipeline design consists of breaking up the large dataset returned by the Connection Input, modeling the data, creating a file, and writing the file to the Data Lake. The Pipeline's polling trigger should be sized against the duration of the Connection Input, the duration of the Pipeline execution, a possible non-uniform flow of data, and a safety factor. The number of records buffered and written to each file should balance Data Lake processing against the required latency (see the buffering sketch after this list).
  • The solution needs to look up the manufacturing order for a given PI Asset element identifier. Querying the source Microsoft SQL Server database for each value change would not be practical at this volume, so the solution replicates the contents of the Microsoft SQL Server table to an in-memory SQLite table. The Pipeline queries the SQLite table to obtain the manufacturing order given the PI element identifier (see the lookup sketch after this list).
  • When optimizing the Pipeline, consider the volume of data being processed; at this volume it might not be possible to use the Replay capabilities.
  • A project file may be downloaded [here]. It is the same project used in the video below; import it into runtime version 4.3 or later to explore and experiment.
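
The Model described in this list can be pictured as the flat record written to each .CSV row. Below is a minimal sketch of one possible record schema with separate columns for numeric and non-numeric values; all field names are assumptions for illustration, not the names used in the downloadable project.

```python
# Illustrative record schema for the contextualized payload.
# Field names are assumptions for this sketch, not the project's actual names.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ValueChangeRecord:
    asset_id: str                    # PI element identifier of the source Asset
    attribute_name: str              # PI Attribute that produced the value change
    timestamp: str                   # value change timestamp (ISO 8601)
    value_numeric: Optional[float]   # populated when the value is numeric
    value_string: Optional[str]      # populated when the value is non-numeric
    order_number: Optional[str]      # manufacturing order looked up from SQLite

def to_record(asset_id, attribute_name, timestamp, value, order_number):
    """Route a raw value into either the numeric or the string column."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return ValueChangeRecord(asset_id, attribute_name, timestamp,
                                 float(value), None, order_number)
    return ValueChangeRecord(asset_id, attribute_name, timestamp,
                             None, str(value), order_number)
```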

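The Pipeline's buffer-and-flush behavior can be approximated outside Intelligence Hub as follows. This is a minimal sketch assuming a boto3 S3 client, a hypothetical bucket name, and the 50,000-record batch size from the assumptions above; the actual Pipeline stages handle this without custom code.

```python
import csv
import io
import time
import boto3

BATCH_SIZE = 50_000            # records per .CSV file, per the solution assumptions
BUCKET = "example-data-lake"   # hypothetical S3 bucket name

s3 = boto3.client("s3")
buffer: list[dict] = []

def add_records(records: list[dict]) -> None:
    """Buffer modeled records and flush a .CSV file to S3 every 50,000 records."""
    buffer.extend(records)
    while len(buffer) >= BATCH_SIZE:
        flush(buffer[:BATCH_SIZE])
        del buffer[:BATCH_SIZE]

def flush(batch: list[dict]) -> None:
    """Serialize one batch to CSV in memory and upload it as a new S3 object."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(batch[0].keys()))
    writer.writeheader()
    writer.writerows(batch)
    key = f"pi-value-changes/{int(time.time())}.csv"
    s3.put_object(Bucket=BUCKET, Key=key, Body=out.getvalue().encode("utf-8"))
```

With the Connection Input returning in under one second and the Pipeline executing in a few seconds on average, the 30-second Polled Trigger used here leaves a comfortable safety factor.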
 
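The order lookup can be sketched as follows: the SQL Server association table is replicated into an in-memory SQLite table on a schedule, and each value change is then resolved to an order with a local query. Table and column names below are assumptions for illustration.

```python
import sqlite3

# In-memory replica of the SQL Server association table (names are illustrative).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE asset_order (pi_element_id TEXT PRIMARY KEY, order_number TEXT)")

def replicate(rows):
    """Reload (pi_element_id, order_number) rows read from SQL Server on a schedule."""
    db.execute("DELETE FROM asset_order")
    db.executemany("INSERT INTO asset_order VALUES (?, ?)", rows)
    db.commit()

def order_for(pi_element_id: str):
    """Resolve the current manufacturing order for a PI element identifier."""
    row = db.execute(
        "SELECT order_number FROM asset_order WHERE pi_element_id = ?",
        (pi_element_id,),
    ).fetchone()
    return row[0] if row else None

# Example usage (illustrative identifiers):
# replicate([("elem-123", "MO-4567")])
# order_for("elem-123")  # -> "MO-4567"
```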


 

 

Results and Recommendations - Order Context 

The following summarizes the results of running the example solution. 

  • The Pipeline was configured with a Polled Trigger that ran every 30 seconds. On average, the Pipeline executed in a few seconds.
  • The Pipeline processed about 4,000 value changes per minute.
  • The example solution did not include error handling in the Pipeline, for example for missed reads of order data or an unexpected flood of PI value changes.

 


 

Solution Assumptions - Asset Context  

The following summarizes the assumptions related to the example solution.

  • The solution scope included one hundred assets defined in Intelligence Hub Namespaces, an Intelligence Hub Model for the assets with ten attributes, and one thousand PI Points associated with the attributes.
  • The Intelligence Hub Pipeline polled PI Data Archive for values for each of the PI Points every five seconds. 
  • The solution required PI Point Names to be associated with a unique hierarchical asset path. These associations were made using Intelligence Hub Namespaces.
  • AVEVA PI System was installed on an Amazon EC2 instance.  The Intelligence Hub PI Connection agent was installed on the same EC2 instance.
  • Intelligence Hub was installed on a second Amazon EC2 instance with 8 GB of RAM. The heap memory allocated to the Java Virtual Machine (JVM) was not adjusted.
  • The Intelligence Hub Pipeline wrote data to a .CSV file that was stored in Amazon S3.  Each .CSV file contained 10,000 records.

 

Solution Summary - Asset Context 

The following summarizes the design of the example solution.

  • One hundred assets were defined with a hierarchy using Intelligence Hub Namespaces.
  • An Intelligence Hub Model was defined with ten attributes. Models might be defined per asset type. The Model Attributes provide an opportunity to define standard names across PI Points that might not have uniform or descriptive names (see the mapping sketch after this list).
  • An Instance was created from the Model and parameters were defined for each Attribute.
  • The Instances were associated with Namespace nodes, and PI Point Names were associated with the proper asset hierarchy and attribute.
  • The Intelligence Hub Pipeline was configured to run every five seconds. The Pipeline uses Smart Query to obtain the asset hierarchy and the mapped PI Point Names, breaks up the payload so that each PI Point Name carries its respective asset path as context, reads the last value and timestamp for each PI Point, and finally creates a .CSV file that is written to Amazon S3 (see the polling sketch after this list).
  • When optimizing the Pipeline, consider the volume of data being processed; at this volume it might not be possible to use the Replay capabilities.
  • A project file may be downloaded [here]. It is the same project used in the video below; import it into runtime version 4.3 or later to explore and experiment.
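
To make the Model, Instance, and Namespace relationships concrete, the configuration can be pictured as a set of (asset path, attribute, PI Point) associations. The sketch below is purely illustrative; the attribute and point names are assumptions, and in the actual solution this mapping is maintained in Intelligence Hub rather than in code.

```python
# Illustrative mapping of asset hierarchy paths to standardized attribute names
# and the underlying PI Point names (all names here are assumptions).
ASSETS = {
    "Site1/Area2/Line3/Mixer1": {
        "MotorSpeed":  "MX1_SPD_PV",   # standardized attribute -> raw PI Point name
        "Temperature": "MX1_TMP_001",
    },
    "Site1/Area2/Line3/Mixer2": {
        "MotorSpeed":  "MX2_SPD_PV",
        "Temperature": "MX2_TMP_001",
    },
}

def flatten(assets: dict):
    """Yield one (asset_path, attribute, pi_point) row per mapped PI Point."""
    for asset_path, attributes in assets.items():
        for attribute, pi_point in attributes.items():
            yield asset_path, attribute, pi_point
```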

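Conceptually, each five-second run reads the last value and timestamp for every mapped PI Point, attaches the asset path and attribute as context, and appends the rows to the .CSV output. The rough sketch below assumes the mapping structure above; read_last_value is a hypothetical stand-in for the PI Data Archive read performed by the Connection.

```python
import csv
import io

def read_last_value(pi_point: str):
    """Hypothetical stand-in for the PI Data Archive read done by the Connection."""
    raise NotImplementedError

def run_once(assets: dict) -> str:
    """Build contextualized rows for one five-second polling cycle as CSV text.

    `assets` maps an asset path to {attribute name: PI Point name}, as in the
    mapping sketch above.
    """
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["asset_path", "attribute", "pi_point", "value", "timestamp"])
    for asset_path, attributes in assets.items():
        for attribute, pi_point in attributes.items():
            value, timestamp = read_last_value(pi_point)
            writer.writerow([asset_path, attribute, pi_point, value, timestamp])
    return out.getvalue()
```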
 


 

 

Results and Recommendations - Asset Context 

The following summarizes the results of running the example solution. 

  • The Pipeline was configured with a Polled Trigger that ran every five seconds. On average, the Pipeline executed in about three seconds.
  • The Pipeline processed about 1,000 value changes per run.
  • The example solution did not include error handling in the Pipeline, for example for missed reads of PI data.

 

Additional Resources