
Data Management with SciData

Brad Carman

February 21, 2013

How to Keep Organized and Automate Your Data Analysis and Reporting


Contents

1 Example Scenario: Spring Testing with a Compression Test
  1.1 How the Sample Data is Organized

2 Quick Overview of SciData
  2.1 Installation
  2.2 Communication with Several Mathematical Programs
    2.2.1 Scilab - Free
    2.2.2 MathCAD - Easy/Intuitive
    2.2.3 MATLAB - Professional
  2.3 Data and Math are Separate
  2.4 Column Types
  2.5 Row Operations vs. Table Operations
    2.5.1 Row-by-Row Mode
    2.5.2 Table Mode

3 Importing Data
  3.1 Start a New SciData Project
  3.2 Copying Data Into the Data Folder
  3.3 The Import Button

4 Scientific Dataset Data File
  4.1 Test Data File Structure
  4.2 Using the Data Import Wizard
  4.3 Running the ‘DataImport.sce’ Script

5 Organizing the Data
  5.1 Column Notes
  5.2 Categories
  5.3 Applying Filters

6 Adding Row-by-Row Calculations
  6.1 Calculating the Spring Rate, k
    6.1.1 Calculation Script
    6.1.2 Tagging Variables for SciData Import
    6.1.3 Filtering Data in Scilab
    6.1.4 Advanced Steps

7 Table Operations
  7.1 Plotting Springs k Values
    7.1.1 Using AutoPlot
    7.1.2 Saving an AutoPlot Script
    7.1.3 Editing an AutoPlot Script

8 Summary

9 Appendix


1 Example Scenario: Spring Testing with a Compression Test

To demonstrate the benefits of Data Management, an example scenario will be used in which a batch of springs is tested. The batch consists of New and Used springs, and the New springs were tested at both room temperature (25◦C) and 70◦C.

(a) Compression Test with Environmental Chamber

                 25◦C   70◦C
  New Springs     •      •
  Used Springs    •

(b) Test Conditions

1.1 How the Sample Data is Organized

As the data is collected, it is stored in folders named by the spring name/number, which are placed inside a folder representing the test temperature, as shown in Fig. 1. Also note that the New springs are numbered 1-10 and the Used springs are numbered like 00##. This is the type of information that is easily lost as time passes. Clear documentation of your data is key to Data Management and is therefore one of the main goals of SciData.

Figure 1: Organization of Data


2 Quick Overview of SciData

Data Management will be achieved in this example using the program SciData (Fig. 2). This program serves as a flexible database for organization of almost any type of data. The following sections detail the benefits of using this program, especially in contrast with a spreadsheet application such as Excel.

Figure 2: SciData Screenshot

2.1 Installation

SciData is found at:

sourceforge.net/projects/scidata

Notes:

1. Scilab must be installed before running the setup.exe file for SciData. Please ensure that the correct version of Scilab is installed, i.e. if you're running a 64-bit machine, you must install the 64-bit version of Scilab!

2. Do not download the DAQ Flex installer into the same folder as setup.exe; there is an error that triggers the wrong executable.

Installation Steps:

1. Install Prerequisites (scilab.org)

2. Install SciData (sourceforge.net/projects/scidata)

Software:

• Prerequisites

– Scilab

• Optional

– DAQ Flex
  Note: DAQ Flex is used to provide integrated data acquisition. SciData DAQ operation works with any DAQ Flex supported product (http://www.mccdaq.com/daq-software/DAQFlex.aspx). It can be downloaded at ftp://ftp.mccdaq.com/downloads/DAQFlex.


– MathCAD (Supports v14-15)

– MATLAB

– SDS: Scientific Dataset library and tools
  Note: The latest version (1.3) offers an updated dataset viewer (http://research.microsoft.com/en-us/projects/sds/).

2.2 Communication with Several Mathematical Programs

One of the challenges of data management is the difficulty of consuming the data in an application for further analysis, reporting, etc. By default SciData communicates with Scilab, a free open source mathematical software which can perform a great majority of the tasks required for most scientific/statistical needs. The communication with Scilab is done with the click of a button in SciData. Therefore there is no need to save and structure files, manipulate data, or manually copy/paste to exchange data to and from Scilab. In addition to Scilab, SciData also works with MathCAD and MATLAB. A quick overview of the different mathematical packages is given below.

2.2.1 Scilab - Free

Scilab (Fig. 3) is a console-based mathematical software package modeled after MATLAB. Commands can be typed one by one into the console window, or they can be stored in a script (*.sce file) and executed line by line. To get started with Scilab, the help menu contains a tutorial. The current tutorial on SciData will also use Scilab exclusively and will help you get started.

Notes:
- Scilab must be installed in order to use SciData.
- Scilab can be downloaded for free from www.scilab.org.

Figure 3: Scilab 5.4.0

2.2.2 MathCAD - Easy/Intuitive

MathCAD (Fig. 4) is a whiteboard-style application that displays math in its natural form. From the MathCAD website:

PTC Mathcad combines the ease and familiarity of an engineering notebook with the powerful features of a dedicated engineering calculations application. Its document-centric, WYSIWYG interface gives you the ability to solve, share, re-use and analyze your calculations without having to learn a new programming language. The result? You can spend more time engineering and less time documenting.

Notes:
- MathCAD is unfortunately not free, but it is reasonably priced.
- SciData does not work with MathCAD Prime; instead you will need to use MathCAD v14 or v15.
- MathCAD is not required to use SciData.

Figure 4: MathCAD v14

2.2.3 MATLAB - Professional

MATLAB is fundamentally very similar to Scilab. Unlike Scilab, it is not free and open source, but it has the benefits of a professional software package: good documentation and support. More simply put, MATLAB is more powerful and robust.

Notes:
- MATLAB is not free and can be expensive depending on the need.
- The MATLAB and Scilab languages are very similar. Scilab can open MATLAB files (scripts and data files).
- MATLAB is not required to use SciData.


Figure 5: MATLAB 2012b

2.3 Data and Math are Separate

Another benefit of SciData comes from the fact that Data and Math are separated. This allows a single math script to be applied across all the datasets in a collection (a set of experimental tests, for example) or a filtered subset of datasets. In contrast, Excel requires formulas to be copied for each additional set of data included in a workbook. In SciData, edits, fixes, changes, etc. are made in one source; in Excel, a simple change to the analysis could require many formulas to be edited, which can be difficult to track and often leads to mistakes.

2.4 Column Types

The Database Table in SciData contains several different column types. The table below (Tbl. 1) describes their differences. Columns are either Locked, Input, or Result:

Locked Cells that cannot be edited; for example the ID column, which is controlled by SciData

Input Cells that can be edited and are used to describe the row, such as a category or a number describing a variable test condition

Result Cells that collect information generated from an analysis


Table 1: Column Types

Name          Data Type             Example                                          Locked, Input, or Result
Default       Text (ID is Number)   Name, Date, Notes                                Input (ID is Locked)
Folder        Text                  Data\Springs\Test1 → F1 = Springs, F2 = Test1    Locked
Category      Text                  Type=New                                         Input
Constant      Number                x=1                                              Input
Result        Number                k=2                                              Result
Text Result   Text                  abc                                              Result
Array         Array                 x = [1;2;3;4;5]                                  Input & Result

2.5 Row Operations vs. Table Operations

The fundamental purpose of SciData is to easily send data to memory from the source database. Therefore you can easily filter the database to send only the data of interest, but you can also send the data in two different modes:

• Row-by-Row

• Table

The SciData Database Table contains several different types of information: numbers, arrays, and text. When in Row-by-Row mode, the information is sent as is. When in Table mode, the information is sent stacked as shown in Fig. 6. Numbers and text (items that are single units) are stacked into arrays, and information that starts as arrays is stacked side by side into a matrix.

Figure 6: How Table Data is Sent
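As a concrete sketch of this stacking (the values below are invented for illustration and are not from the example data), three table rows each carrying a single value k and an equal-length array Load would arrive in memory as:

    // Hypothetical illustration of Table-mode stacking (invented values).
    k_row1 = 0.144; k_row2 = 0.142; k_row3 = 0.139; // one single value per table row
    k = [k_row1; k_row2; k_row3]                    // single values stack into a 3x1 array

    Load_row1 = [10; 20; 30];                       // one array per table row
    Load_row2 = [11; 21; 31];
    Load_row3 = [12; 22; 32];
    Load = [Load_row1, Load_row2, Load_row3]        // arrays stack side by side into a 3x3 matrix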

In SciData, the last two tabs in the toolbar are Row and Table. These are used for Row-by-Row operations and Table operations, respectively. We can now explore how these two modes work.


2.5.1 Row-by-Row Mode

The Row-by-Row mode is made up of 3 steps:

Step 1 Send data to memory (a row in the Database Table represents a data file which can contain numbers, text, or arrays)

Step 2 A script (Scilab, MathCAD, or MATLAB) is executed using the data in memory

Step 3 Results are retrieved by SciData and stored in the Database Table and in the respective SDS data file

These steps can be visualized in Fig. 7. Each row represents a whole data file; whatever that file contains is sent to memory. Then the selected script is executed using the data in memory. Any results that are generated can then be retrieved and stored in SciData. How to tell SciData which results to retrieve will be discussed later.

Figure 7: Data is sent row by row to a single math script

The Row tab offers different methods for Row-by-Row memory operations. As can be seen (Fig. 8), there are the buttons Send Row [1], Process Row [2], and All Filtered Rows [3]. When using Send Row, the information from the currently selected row is sent to the memory of the designated program. By default, Scilab is the designated program and always runs in the background of SciData, but if a different script file is created and selected [4], then this will designate which of the three programs (Scilab, MathCAD, or MATLAB) to target. The Process Row button will also target the selected script file, but it will run through all 3 steps. Finally, All Filtered Rows will batch process the Database Table (in its filtered state), processing each row one by one.


Figure 8: Buttons of the Row Tab

Fig. 9 illustrates an example of the Row-by-Row mode. By moving to the Row tab [1], selecting a row [2], and clicking Send Row [3], data is sent to Scilab's memory. Note that all of the values sent to memory from the Database Table turn pink. We can see in the Console Window [9] that the memory was first cleared [4b] (controlled by the 'Clear Memory Pre Calc' option [4a]), followed by row ID 1 loading to memory [5]. We can check the information in memory by executing some Scilab commands using the Command Box [8]. By typing 'Name' into the Command Box, we can see its value [6]. Additionally, we can use the size() function on the variable Load to see that it is an array of 626 rows [7].

Figure 9: Sending a Row from SciData

2.5.2 Table Mode

The Table tab is shown in Fig. 10, and it offers 4 different buttons for memory operations. The table below the buttons shows their different functions. As can be seen, the buttons Send Lite and Process Lite do not send arrays. This is to conserve memory if needed; in some cases a large Database Table with large arrays can easily consume all the available memory, so if the arrays are not needed, the 'Lite' option is available. Also, similar to the Row tab, the 'Process' buttons both send data to memory and execute the target script, but notice that Step 3 is not available in this mode. Any results generated from the Table mode need to be saved using other means.


Figure 10: Buttons of the Table Tab

Fig. 11 illustrates an example of the Table mode. The Database Table is first filtered [1] with ID = 1 & 2. After clicking Send Full [2], the table will turn pink [3], indicating which values have been loaded to memory. We can then check how the variables exist in memory using the Command Box. First, type 'Name' to see that it contains an array of text values [4]. Second, using the size() function again, we check the Load variable to see that it is an array with 626 rows and 2 columns, one column for each table row.

Figure 11: Sending the Filtered Database Table from SciData

3 Importing Data

Now that the basic operation of SciData is understood, the example analysis with the spring data will begin. The first step to analyzing data is to import it.

3.1 Start a New SciData Project

Before importing data we must create a project. Simply start SciData (Start > SciData), which brings up the start screen (Fig. 12). Choose ‘Start a New File’. Save the file ‘Spring Analysis.sdat’.


Figure 12: SciData Start Screen

Understand that the structure of SciData is as follows (referring to Fig. 13):

Project Folder [7] Root folder that contains the SciData file (*.sdat) along with the Data and Math folders. The SciData file must always be shadowed by a Math and Data folder (which are created automatically).

SciData File [1] The *.sdat file (contains database structure information)

Data Folder [6] Contains the data files that SciData manages

Math Folder [5] Contains the scripts (either Scilab, MathCAD, or MATLAB files)

Row Folder [3] Row-by-Row scripts, intended to process a single row of data

Table Folder [2] Table scripts, intended to process many rows of data

DAQ Folder [4] Scripts for Data Acquisition

Figure 13: SciData File Structure
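For the example project this produces a layout roughly like the sketch below (the root folder name is whatever was chosen when saving the *.sdat file; subfolder names follow Fig. 13):

    <project folder>\
        Spring Analysis.sdat        (SciData File)
        Data\                       (data files managed by SciData)
        Math\
            Row\                    (Row-by-Row scripts)
            Table\                  (Table scripts)
            DAQ\                    (Data Acquisition scripts)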

3.2 Copying Data Into the Data Folder

The first way to import data to SciData for this example is to copy the contents of the Spring folder (Fig. 1) into the Data folder (note the Open Data Folder button, Fig. 14 [5], is available to quickly navigate there). When the copy operation is complete, click the Scan Data Folder button [2] under the Data tab [1]. As can be seen in [3], 28 rows are imported. With each row that is imported, a note is shown in the Console Window that reads:

>> CSV file is not a SDS Standard file, renaming to ... *.dat

When the Scan Data Folder button is used, SciData searches the Data folder for the following:

Scientific Dataset (SDS) Files A data file standard developed by Microsoft Research

*.csv Comma separated text file

*.nc Binary (NetCDF)


*.sod Files Binary (Scilab)

*.dat Files Standard data files

User Defined Files Any additional extension added in the ‘Ext:’ text box [6]

Folders If ’Mark Folders’ is checked, SciData will add a data file for all folders found that do not already contain a data file. This is useful for adding data grouped by folders in several different files.

Figure 14: Scanning the Data Folder

If a *.csv file is found, SciData attempts to read the file as an SDS file. If it fails, the *.csv file is copied to a *.dat file. A clean shadow SDS file is then created. Therefore, the Data folder ends up with both *.dat and *.csv files, as shown below in Fig. 15. At this point, the *.csv file is empty; in the next step the file will be populated from the original source.

Figure 15: Shadow File

Note, at any time, additional data can be added to the Data folder. To keep the Database Table in sync with changes made (either adding or removing), simply run the Scan Data Folder command again.

3.3 The Import Button

It is also possible to use the Import Data button (Fig. 16 [1]). In this case a dialog is presented [2] to copy selected files and folders into the Data folder, followed by an automatic scan.


Figure 16: Import Data Dialog

4 Scientific Dataset Data File

As mentioned, SciData stores information using the Scientific Dataset (SDS) standard developed by Microsoft Research. Storing data using an established standard has the benefit of making it easier to share and access. Furthermore, if used correctly, each data file should be self-descriptive and contain all the important related data. For our example, it will be important to store the two main conditions: ‘temperature’ and ‘spring condition’ (new vs. used). The SDS format allows for this. It should also be noted that the SDS package contains a viewer tool (Fig. 17) which could be useful for sharing your data.

Figure 17: SDS Viewer


4.1 Test Data File Structure

The data saved from the experiment is structured as shown below (Fig. 18a). If we were using Excel to analyze this data, we would need to import each file and define all the data ranges, which is time-consuming since each data file is a little different in length (not to mention that all this would need to be done manually). In contrast, SciData will automate this process and convert all these data files to the SDS format, as shown in Fig. 18b. Note that the ‘Single Values’ area contains the extra metadata that makes the file self-descriptive. We can see the test temperature is 25◦C and the spring condition or ‘Type’ was Used.

(a) Instron Data File (b) SDS Data File

Figure 18: Data File Structures

4.2 Using the Data Import Wizard

To properly import the data from the *.dat file to the SDS *.csv file, a wizard is provided to make this process much easier. The following figures walk through the steps.

Step 1 First go to the Data tab (Fig. 20 [1]) and click the Build an Import Script button [2].

Step 2 Next, browse [3] for an example data file. Choose any of the converted data files. The files are ‘comma’ delimited, but note this can be changed (Fig. 21a [4]).

Step 3 Now, to import the Time column, click the first cell of the data column [6] and then click OK (Fig. 21b [7]).

Step 4 Specify as a data column [8], name it [9], add the appropriate unit [10] (optional), and click Add (Fig. 22a [11]).

Step 5 Notice a line is added to the code window (Fig. 22b [12]).

Repeat 3-5 Add the Load and Ext columns by following the same process.

Save Finally, click the save button. A script named ‘DataImport.sce’ is saved to the ‘Math\Row’ folder.


When completed, the generated script should look like the one shown in Fig. 19.

if atomsIsInstalled('csv_readwrite')

    fileName = data_path + '\'; i = 1;
    while exists('F' + string(i))
        fileName = fileName + evstr('F' + string(i)) + '\'; i = i+1;
    end
    fileName = fileName + Name;
    fn = mopen(fileName, 'rt');
    dataText = mgetl(fn);
    mclose(fn);

else
    disp('csv_readwrite is not installed, installing now, please wait...')
    atomsInstall('csv_readwrite')

end

// Import Values...
Time = findvalue(9, 'column', 1, ','); // [sec] [@]
Load = findvalue(9, 'column', 2, ','); // [N] [@]
Ext = findvalue(9, 'column', 3, ','); // [mm] [@]

Figure 19: Import Script Code

Figure 20: Data Import Script Wizard (Part A)


(a) Part B (b) Part C

Figure 21: Data Import Wizard

(a) Part D (b) Part E

Figure 22: Data Import Script Wizard

4.3 Running the ‘DataImport.sce’ Script

After hitting Save (Fig. 22b [13]) on the Import Script Wizard, SciData will add the necessary columns (Fig. 23 [1]) and display the message [2] to run the script [4]. Moving to the Row tab [3], you can see that the ‘DataImport.sce’ file is selected [4]. We simply need to hit the All Filtered Rows button [5] and all the filtered rows in the Database Table will be batch processed with the selected script, one by one.


Figure 23: After Using the Import Script Wizard

Fig. 24 shows the result after batch processing the script across all the rows. The array variables Time, Load, and Ext are populated, and what is shown [1] is the number of points in each array. Note that the arrays are currently only populated in memory, as indicated by the disk icons [2]. After clicking the Save button [3], the respective SDS data files are populated on disk.

Figure 24: After Running ‘DataImport.sce’

5 Organizing the Data

Now that our data is prepared, it is important that we add some descriptive information. This is helpful for defining what the data is for future reference, but it also helps us sort and filter the data we want.


5.1 Column Notes

There are several places to add notes with extra useful information. There is a default Notes column to allow for row-specific notes, such as describing which experimental runs had anomalies, etc. The second place to add notes is to the columns themselves, to describe what they mean. This can be done using the Column Editor (Fig. 25). All of the columns can have notes added. When using folders in the Data folder, as a best practice it's a good idea to give notes describing what each folder represents. The first folder in the present example represents the environmental test temperature; the second folder represents the spring designation. Therefore, click Edit Columns [1], select Folder [2], and add notes [3] to describe F1 and F2.

Figure 25: Column Editor

The notes given to the columns show up when the mouse is hovered over the column header (Fig. 26 [1]), or when looking at the column list [2-3].


Figure 26: Column Notes

5.2 Categories

It is also possible to add additional columns to describe the data. For the present example, the dataset does not yet describe which tests were done on New springs and which were done on Used springs. Therefore, we can add a category Type as shown in Fig. 27. First, select Category [1] from the column type list. Then type in the category name ‘Type’ [2]. Click Add [3] and the column will be added to the list. Category values can be added for later selection using the text box [4]. Type a value and then click Add [5]. Add New and Used to the list [6].

Figure 27: Categories


To apply the category values, the rows are first sorted by the date column (Fig. 28 [1]), since it is known that the New springs were tested first and the Used springs last. It is also known that the New springs were numbered 1-10; therefore, the last rows after 10 must be the Used springs. These rows are selected [2] and the Used value is set with a right mouse click [3-4].

Figure 28: Setting Category Values

5.3 Applying Filters

To apply a filter, click the filter button either from the column header (Fig. 29 [1]) or in the column list [3]. Note, to get to the Column List, click the button [2] to expand the control. The filter editor for the particular column shows at the bottom right [4]. To filter the New springs, simply check the appropriate box [5] and click OK [6].


Figure 29: Applying a Filter

Once a filter is applied, it will show above the Database Table (Fig. 30 [1]). Note that the table will then only show the rows meeting the filter criteria [2]. The filter can be edited by clicking the filter button to show the filter editor [3].

Figure 30: Applied Filter

6 Adding Row-by-Row Calculations

SciData allows you to process each row in the data table individually, or the whole Database Table at once. In our case we need to calculate the spring rate for each test, which would be a Row-by-Row process.


6.1 Calculating the Spring Rate, k

The goal of the present example is to compare the spring rate, so we need to calculate this value for each of the datasets. First, just focus on a single dataset. Plotting the data is a good first step, so click the first row in the table and click Send Row.

6.1.1 Calculation Script

Previously we used a row-by-row script to import the data. That script was generated automatically, so we did not need to write any code. In this step we will add another script file and write the Scilab code manually to calculate the spring rate. The first step is to add a new Scilab file by clicking the Add New File button (Fig. 31 [1]) and choosing Add Scilab File [2]. Choose the file name ‘k Calculation’ [3] and click OK [4].

Figure 31: Adding a Scilab Script

As can be seen in Fig. 32, a script is added [1] to SciData. Included is a list of the available variables from SciData [2].


Figure 32: Scilab Script Editor

Before the script is applied, Scilab must be loaded with data. We can select the first row in the table and click Send Row (Fig. 33 [1]). As can be seen [2], the row will turn pink to indicate it is loaded in memory. This is also reflected in the Console Window [3].

Figure 33: Sending Data to Memory

To test the script, we can simply add the line plot(Ext, Load) (Fig. 34 [1]). If we execute the file by clicking the Execute button [2] (after saving the file), a plot of Load vs. Ext will appear [3].


Figure 34: Testing the Script

We can continue to edit the file in SciData, but we also have the option to open the file in Scilab for a better editing experience. By editing in Scilab we can see more information about errors that occur, as well as a more detailed code editor. By clicking the Open Externally button (Fig. 35 [1]), Scilab will open with the current memory state. Note, it is possible to launch many Scilab instances with different memory states. Be careful to keep track of this; best practice is to close Scilab when you are finished editing a script. The external Scilab will not affect the memory of SciData.

Figure 35: Opening Scilab Editor

Scilab 5.4.0 offers a docked environment which shows the Console (Fig. 36 [1]), Script Editor [2] (note the improved syntax highlighting, including coloring variables red [3]), and the Variable Browser [4]. As can be seen in the Variable Browser, Scilab is indeed loaded with the current memory state from SciData.

Figure 36: Scilab Loaded From SciData

We now jump into writing a script that can calculate the spring rate. Fig. 37 shows the appropriate method to achieve this. We will go through this script now to explain what is being done. The strategy to calculate the spring rate is to apply a least squares curve fit to the data of the form

y = x ∗ k + i

where
- y represents the Load [N]
- x represents the spring displacement (Ext) [mm]
- k is the spring constant (the slope of Load vs. Displacement) [N/mm]
- i is the intercept (acts as an error indicator, since the intercept should be zero) [N]
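In other words, the fit chooses k and i to minimize the sum of squared residuals between the model and the measured points (standard least squares, stated here for reference):

    \min_{k,\, i} \; \sum_{j=1}^{n} \left( \mathrm{Load}_j - \left( k\,\mathrm{Ext}_j + i \right) \right)^2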

Scilab has the ability to do a least squares fit to any function, which is great, but the downside is that it is not very user-friendly. The code below in Fig. 37 provides a clean example of how to apply the least squares fit. The steps to set this up are as follows:

1. Create a function to apply the curve fit. For this example we name the function ‘line’. The function should have the arguments x, coeffs, and params [lines 1-8].
   - x is the independent variable.
   - coeffs is an array of size 1 to n representing the coefficients to be solved.
   - params is an array of extra parameters used by the function; it is not used in the current example.

2. Create an error function ‘err’ which calculates the difference between the line() function and the data [lines 10-12].

3. Prepare the inputs x_data and y_data by setting them to Ext and Load, respectively. We must also prepare guess values, which can often be set to zeros. In our case we are solving for 2 coefficients, so we set guess_coeffs to [0;0]. In cases where leastsq() fails to solve, better guess values must be supplied [lines 14-17].
   - Note: To find good guess values and/or a good fit function, try www.zunzun.com.


4. Use leastsq() to calculate the optimal coefficients of the line() function. Note this function returns 3 outputs, but we are only concerned with coeff_opt, which holds the optimal solved coefficients [line 19].

5. The spring coefficient and intercept are extracted from coeff_opt [lines 21-22].

6. Finally, line() is plotted against the data to check the result [lines 24-27].

1  function y=line(x, coeffs, params)
2
3      i = coeffs(1); // intercept
4      k = coeffs(2); // slope
5
6      y = x*k + i;
7
8  endfunction
9
10 function y=err(coeffs, x_data, y_data, params)
11     y = line(x_data, coeffs, params) - y_data;
12 endfunction
13
14 x_data = Ext;
15 y_data = Load;
16 params = [];
17 guess_coeffs = [0;0];
18
19 [f, coeff_opt, g] = leastsq(list(err, x_data, y_data, params), guess_coeffs)
20
21 k = coeff_opt(2); // solved spring constant
22 i = coeff_opt(1); // solved intercept
23
24 y_test = line(x_data, [i;k], []);
25
26 plot(x_data, [y_data y_test])
27 legend(["data"; "fun"])

Figure 37: Calculating the Spring Constant, k

The result from the solved spring coefficient, k, is shown in Fig. 38. As can be seen, a good curve fit is found. So now what we'd like to do is store the spring coefficient value in SciData so it can be calculated for all the datasets.


Figure 38: Least Squares Fit Script Result
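As an aside, for a pure straight-line fit Scilab's built-in reglin() can replace the general leastsq() setup; the sketch below is an alternative, not the method used in the rest of this tutorial, and assumes Ext and Load are already in memory from Send Row:

    // Alternative sketch: straight-line regression with reglin() instead of leastsq().
    // reglin() expects row vectors, so the column arrays are transposed.
    [slope, intercept] = reglin(Ext', Load');
    k = slope;      // spring constant [N/mm]
    i = intercept;  // intercept [N], should be near zero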

6.1.2 Tagging Variables for SciData Import

In the Row-by-Row mode of SciData, data can not only be sent to Scilab/MathCAD/MATLAB, but it can also be retrieved and stored. Three types of information can be retrieved:

Single Numbers This type is tagged in Scilab with #

Array This type is tagged in Scilab with @

Non Numeric (Strings) This type is tagged in Scilab with $

After SciData runs a script, it then looks for results to be retrieved. There are 3 different result column types to match the types listed above, with the respective icons. These columns can either be added manually using the Column Editor, or they can be added automatically by tagging variables in the script.

In the current example, we need to store the spring coefficient, k, which is a Single Number, so we tag the variable with #. Tagging in Scilab is done with a comment after the variable, structured as shown in Fig. 39.

variable = ... ; // [optional unit] [#]

Figure 39: Variable Tagging Structure

Therefore, the code is modified as shown in Fig. 40 [1]. After this is done, clicking the Scan File button [2] will import k and i to SciData. This will add the columns to the Database Table [3]. Information about the scanning process will also be shown in the log window [4].


Figure 40: Scanning for Tagged Variables
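As a minimal sketch of the result of this step, the tagged lines in ‘k Calculation.sce’ take the following form (compare the listing in Fig. 45); the trailing comment carries the optional unit followed by the type tag:

    // solved spring constant, retrieved by SciData as a Single Number
    k = coeff_opt(2); // [N/mm] [#]
    // solved intercept, retrieved by SciData as a Single Number
    i = coeff_opt(1); // [N] [#]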

Now that the columns are added to SciData, we can select the first row (Fig. 41 [1]) and click the Process Row button [2]. As can be seen, the columns k and i will be populated [3 & 4].

Figure 41: Processing a Row

We can now click the All Filtered Rows button (Fig. 42 [1]). As can be seen, the whole filtered Database Table will be processed [2]. In this way, SciData is always structured and ready to batch process your analysis. There are many benefits that come from batch processing an analysis:

• Fast and Efficient

• Automatic: for more time-consuming analysis files, the computer can continue to process the data without needing any manual input


• Ensures the same consistent analysis across all the data. Furthermore, changes to the analysis can then easily be refreshed consistently across all the data.

Figure 42: Processing all Filtered Rows

6.1.3 Filtering Data in Scilab

We can see from each of the data files that the ends of the Ext vs. Load curve are not very linear (Fig. 43). If we are after the general spring constant and want to ignore the effects at the ends of the curve, then we can easily filter the data first. What we will do is modify our ‘k Calculation’ script file to first filter the arrays Ext and Load so that the first and last 2 mm are removed. Then the curve fit will be applied to the filtered arrays. Furthermore, for later use, the filtered arrays will be stored in SciData.

Figure 43: Ext vs. Load

To accomplish this, the ‘k Calculation’ script is modified as shown in Fig. 44 [1]. As can be seen, line 14 stores the filtered rows in the variable rws using the convenient Scilab/MATLAB function find(). Then the rows of rws are extracted from Ext and Load as shown in lines 15 and 16. Finally, the variables x_data and y_data are updated with the new filtered arrays. We then tag the new filtered arrays as shown in [2]. After the changes are made, the file must be saved [3]. After clicking Scan File, the new array columns are added as shown in [5]. To check the new script, Process Row is clicked and the results are updated. Notice that k went from 0.144 to 0.142. Also notice that the filtered arrays are populated [5], and the number of rows is 5830, compared with the original array size of 7522 rows.

Figure 44: Filtering and Storing Arrays
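The relevant lines of the modified script are sketched below (they also appear in the full listing of Fig. 45); find() returns the row indices inside the linear region, and the filtered arrays carry the @ tag so SciData stores them:

    rws = find(Ext > 2 & Ext < 16);  // rows between 2 mm and 16 mm of extension
    Ext_f = Ext(rws);   // [mm] [@]
    Load_f = Load(rws); // [N] [@]

    x_data = Ext_f;     // the curve fit now uses the filtered arrays
    y_data = Load_f;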

6.1.4 Advanced Steps

Saving the Plot It is possible to use the xs2png() function to save the generated plot to a picture file. An example is shown here where the plot is saved to a file alongside the data file. Several changes are made to the code (Fig. 45) as listed below:

line 30 xdel() is used to clear any previous plots

line 34 To thin out the data points plotted, we plot every 50th data point using the call ‘1:50:$’, which is interpreted as index/row 1, spaced by every 50th index/row, till the last row ($).

line 34 & 35 Note the call to plot() formats the lines using the extra inputs ‘o’ and ‘--’, in additionto the ‘LineWidth’ and ‘Color’ inputs. The ‘Color’ argument is supplied a [R,G,B] array.

line 36 & 37 The x-axis and y-axis are labeled

line 39 Using msscanf(), the spring description from F2 is decomposed.

line 40 The Spring Number is extracted and stored.

line 41 Using msprintf(), the plot title is built with the Spring Number, Temperature, Type, and k value.

line 43 The title is added to the plot

line 47 The file name and path are created using the automatic data_path variable, F1, F2, and Name

line 49 The plot is printed to file using xs2png(). The call to gcf() grabs a reference to the current plot.


1  function y=line(x, coeffs, params)
2
3      b = coeffs(1); // intercept
4      m = coeffs(2); // slope
5
6      y = m*x + b;
7
8  endfunction
9
10 function y=err(coeffs, x_data, y_data, params)
11     y = line(x_data, coeffs, params) - y_data;
12 endfunction
13
14 rws = find(Ext > 2 & Ext < 16);
15 Ext_f = Ext(rws); // [mm] [@]
16 Load_f = Load(rws); // [N] [@]
17
18 x_data = Ext_f;
19 y_data = Load_f;
20 params = [];
21 guess_coeff = [0;0];
22
23 [f, coeff_opt, g] = leastsq(list(err, x_data, y_data, params), guess_coeff)
24
25 // solved spring constant
26 k = coeff_opt(2); // [N/mm] [#]
27 // solved intercept
28 i = coeff_opt(1); // [N] [#]
29
30 xdel(winsid()) // clear previous plots
31
32 x_test = 0:1:18;
33 y_test = line(x_test, [i;k], []);
34 plot(x_data(1:50:$), y_data(1:50:$), "o")
35 plot(x_test, y_test, "--", "LineWidth", 2, "Color", "black")
36 xlabel("Extension [mm]")
37 ylabel("Load [N]")
38
39 F2Parts = msscanf(F2, '%s%i'); // Extract Spring Description Parts
40 SpringNum = F2Parts(2);
41 SpringTitle = msprintf("Spring=%i, Temperature=%s, Type=%s, k=%0.3f [N/mm]", SpringNum, F1, Type, k)
42
43 title(SpringTitle)
44 legend(["data"; "fun"])
45
46 // file path and name
47 plot_path = data_path + '\' + F1 + '\' + F2 + '\' + Name + '.png';
48 // write the plot to a picture file
49 xs2png(gcf(), plot_path);

Figure 45: Advanced Edits to ‘k Calculation.sce’

By running this script across all the filtered rows, plots will be generated and saved for the respective datasets, creating a whole set of plots.

Writing a LaTeX Document Test Summary It could be useful to have all the plots together in a grid for reference. This can be done using LaTeX, created using a Scilab Table Operation script (see next section). Images can be included in a LaTeX document using the \includegraphics{filename} command. Therefore we create a script in Scilab that builds the paths to all the plot images we just created and outputs the commands to a LaTeX file. This file is read into a master LaTeX document using the \input command. The code to write the LaTeX files is shown in Fig. 46.

The code begins with a function that helps write out the LaTeX code. The function create_figures() is used to create an array of LaTeX code lines. The hot and cold rows are filtered using the find() function and the respective LaTeX code is written to file. Therefore, two separate LaTeX files are created. These files are then inserted into a master document, as shown in Fig. 47. The resulting document is shown in the appendix.

function y=create_figures(rws)
    n = length(rws);
    tex = [];

    for i = 1:n

        tex = [tex; "\includegraphics[height=2.5in,"];
        tex = [tex; "type=png,"];
        tex = [tex; "ext=.png,"];
        tex = [tex; "read=.png]"];

        tex = [tex; msprintf("{""../Data/%s/%s/%s""}", F1(rws(i)), F2(rws(i)), Name(rws(i)))];
        tex = [tex; " "];
        tex = [tex; " "];

    end

    y = tex;

endfunction

// Write the 25C File
rws = find(F1 == "25C")
tex = create_figures(rws);
filename = data_path + "\..\tex\plots_25C.tex";
mdelete(filename);
write(filename, tex);

// Write the 70C File
rws = find(F1 == "70C")
tex = create_figures(rws);
filename = data_path + "\..\tex\plots_70C.tex";
mdelete(filename);
write(filename, tex);

Figure 46: Code to Create LaTeX Files


\documentclass[10pt,letterpaper]{article}

\usepackage{fullpage}

\usepackage{graphicx}

\usepackage{multicol}

\author{Brad Carman}

\title{Test Summary}

\begin{document}

\maketitle

\pagebreak

\section{New Springs}

\subsection{Temperature = 25$^{\circ}$C}

\begin{multicols}{2}

\input{plots_25C.tex}

\end{multicols}

\pagebreak

\subsection{Temperature = 70$^{\circ}$C}

\begin{multicols}{2}

\input{plots_70C.tex}

\end{multicols}

\end{document}

Figure 47: Code to Create LaTeX Image Grid

7 Table Operations

7.1 Plotting Springs k Values

Now that we have calculated all the k values for all the datasets, we can plot them and make comparisons.

First, move to the Table tab (Fig. 48 [1]) and click Send Full [2]. Note that the whole table turns pink, representing that everything was sent to memory. Single values are stacked and sent as arrays, and arrays are stacked (by column) into matrices, as shown in Fig. 6.


Figure 48: Sending Table Data

As a quick visual of what is in memory, type ‘plot(k)’ [3] into the Command Box, or simply ‘k’ [4], and hit enter. As can be seen, k is plotted [2] and printed in the Console Window [5].

7.1.1 Using AutoPlot

Our goal now is to show a plot of k values comparing Hot vs. Cold. There are two options: 1) write a script that can filter and plot the data, or 2) use the SciData feature AutoPlot. To test out AutoPlot, simply click the AutoPlot button (Fig. 49 [3]). As can be seen, an AutoPlot sheet will be added [4]. The column k can be selected for the ‘y’ value [5]. If we want to see a comparison by temperature, we know that the datasets were grouped by the folder F1. Therefore, we choose F1 in the ‘Group 1 By’ selection box [6]. Replace the ‘Group 1 Header’ with ‘Temperature’ [7]. Now click the Test button [8]. You will see the plot now shows the k values separated by temperature [9]. Note that the plot automatically labels the y-axis and legend. Compared to how this would be done in Excel, this is a major time saver.


Figure 49: Using AutoPlot

7.1.2 Saving an AutoPlot Script

Once you have the AutoPlot set to your liking, you can save it as a script. Simply add a title (Fig. 50 [1]), then click the Save button [2].

Figure 50: Saving an AutoPlot

When the AutoPlot is saved, it will open another script, as shown in Fig. 51 [1]. This script can then be edited to make additional changes to the plot.


Figure 51: Saved AutoPlot Script

7.1.3 Editing an AutoPlot Script

If we want to add a few custom items to the plot generated from AutoPlot, it is possible by editing the script. As can be seen in Fig. 52 [1], a few lines of code have been added with the goal of drawing a line at the mean of the hot and cold groups. We can use the find() function again to get the rows of the hot and cold groups. Then we can plot a line from the min to max x-axis values, as shown in lines 44 and 47. The min and max are found from the structure variable a. By opening and running the script in Scilab it can be seen what information is stored in a (as shown in Fig. 53). Notice that we have the axis limits available, along with all the plot settings.

Figure 52: Editing an AutoPlot Script
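Since the edit in Fig. 52 is shown only as an image, the sketch below gives the general idea; it assumes a is a standard Scilab struct with the fields listed in Fig. 53 (group1, y, x_min, x_max), and that the group labels "25C" and "70C" come from the F1 folder names:

    // draw a dashed horizontal line at the mean k of each temperature group
    rws_cold = find(a.group1 == "25C");
    rws_hot  = find(a.group1 == "70C");
    plot([a.x_min, a.x_max], mean(a.y(rws_cold)) * [1, 1], "--")
    plot([a.x_min, a.x_max], mean(a.y(rws_hot))  * [1, 1], "--")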


-->a
 a  =

   y: [20x1 constant]
   group1: [20x1 string]
   group1Header: "Temperature"
   g1t: "m"
   group2Header: ""
   g2t: "c"
   name: "k Plot"
   lgnd_pos: "in_upper_right"
   x_text: ""
   y_text: "k [N/mm]"
   z_text: ""
   font_size: 3
   mark_size: 5
   grid: [-1,-1,-1]
   log_flags: "nnn"
   show_legend: 1
   show_lines: "on"
   show_markers: "on"
   bar: 0
   figure_size: [600,600]
   plt: [1x1 handle]
   lgnd: [2x1 string]
   lgndH: [2x1 handle]
   x_min: 1
   y_min: 0.1351258
   x_max: 10
   y_max: 0.1430398

Figure 53: AutoPlot Structured Variable a


8 Summary

In summary, this tutorial shows the following benefits of Data Management combined with Script Based Analysis:

• A flexible database of experiments/data files/folders/information that clearly documents the stored data

• Filtering capability to quickly find and process information

• Separated Data and Math: one source to edit, keeping analysis and results in sync

• Script-based plotting: one script can quickly create many plots, with no time-consuming manual work needed

• Combined with LaTeX, reports can be automated

9 Appendix


1 New Springs

1.1 Temperature = 25◦C



1.2 Temperature = 70◦C

