Import data saved with Mass-Up in external applications

The Save Data operation allows you to save your data into .csv files. This is useful if you have preprocessed a raw dataset and want to store the preprocessed data for further analysis with Mass-Up or with other applications, such as R.


R

If you want to load .csv spectra files with R, the easiest way is to use the MALDIquant and MALDIquantForeign packages: MALDIquantForeign provides the import function, which reads spectra from .csv files into MALDIquant objects.

Import a single .csv spectra file with MALDIquant

Consider that you have a file called spectra.csv with the following content:

	Mass,Intensity
	72.38649,4.7928915
	92.86101,11.554423
	103.110954,23.025375
	115.28742,8.338575
	135.57188,76.37024
	137.58994,57.889793

You can import this spectrum into a list by running the following R commands:

	library("MALDIquantForeign");
	spectra <- import("spectra.csv");

Note that spectra is a list, so typing spectra[[1]] in the R console shows the loaded data:

	> spectra[[1]]
	S4 class type            : MassSpectrum         
	Number of m/z values     : 6                    
	Range of m/z values      : 72.386 - 137.59      
	Range of intensity values: 4.793e+00 - 7.637e+01
	Memory usage             : 1.523 KiB            
	File                     : /tmp/spectra.csv
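
For a quick check without any additional package, a Mass-Up .csv file can also be read with base R. The following is a minimal sketch, assuming the two-column Mass,Intensity layout shown above; the names csvFile and peaks are illustrative, and the file is written to a temporary directory only so that the example is self-contained.

```r
# Write the example content shown above to a temporary file
csvFile <- file.path(tempdir(), "spectra.csv")
writeLines(c(
  "Mass,Intensity",
  "72.38649,4.7928915",
  "92.86101,11.554423",
  "103.110954,23.025375",
  "115.28742,8.338575",
  "135.57188,76.37024",
  "137.58994,57.889793"
), csvFile)

# Read it as a plain data frame with the columns Mass and Intensity
peaks <- read.csv(csvFile)

# Number of points and m/z range of the spectrum
nPoints <- nrow(peaks)
mzRange <- range(peaks$Mass)

# m/z value of the most intense point
basePeakMz <- peaks$Mass[which.max(peaks$Intensity)]
```

This yields a plain data frame rather than a MassSpectrum object, so it is only suitable for inspection or for tools that do not depend on MALDIquant.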

Import a dataset with MALDIquant

Consider that you have saved your preprocessed dataset into a directory called dataset, which has three condition sub-directories called HEALTHY, LYMPHOMA, and MYELOMA.

Each condition sub-directory contains one sub-directory per sample, and each sample sub-directory contains one or more .csv spectra files.

If you want to load all the spectra into a list, you just have to run the following R commands:

	library("MALDIquantForeign");
	spectra <- import("dataset");

With this command, all the spectra are loaded into a single flat list, so you have to process this list to extract the spectra of the sample or condition that you want. Suppose that you want to create a separate list for each sample. First, get the sample names by reading the directory names:

	# List every directory under dataset: the root, the conditions and the samples
	sampleNames <- list.dirs(path="dataset", recursive=TRUE)
	# Keep only the sample directories (skip the root and the three condition directories)
	sampleNames <- sampleNames[c(3:4,6:10,12:16)]
	# Drop the leading path so that only the sample name remains
	sampleNames <- basename(sampleNames)

Now, sampleNames is a character vector with the names of your samples:

	> sampleNames
	[1] "HA" "HB" "LA" "LB" "LC" "LD" "LE" "MA" "MB" "MC" "MD" "ME"
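
The fixed positions used to select the sample directories only hold for this particular dataset. A possible alternative, sketched here under the assumption that the layout is always dataset/condition/sample, is to select the directories by their depth. The directory dataset_demo and its content are created in a temporary location purely for illustration.

```r
# Build a small throwaway directory tree that mimics the layout described above
root <- file.path(tempdir(), "dataset_demo")
demoSamples <- c("HEALTHY/HA", "HEALTHY/HB", "LYMPHOMA/LA", "MYELOMA/MA")
for (s in demoSamples) {
  dir.create(file.path(root, s), recursive = TRUE, showWarnings = FALSE)
}

# Relative paths: "" for the root, "HEALTHY" for a condition, "HEALTHY/HA" for a sample
dirs <- list.dirs(path = root, recursive = TRUE, full.names = FALSE)

# Sample directories are exactly two levels below the root
depth <- lengths(strsplit(dirs, "/", fixed = TRUE))
sampleDirs <- dirs[depth == 2]

# The last path component is the sample, the one before it is the condition
sampleNames <- basename(sampleDirs)
conditionNames <- dirname(sampleDirs)
```

Selecting by depth keeps working when samples are added to or removed from a condition, whereas a hard-coded index vector has to be updated by hand.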

Since every sample in this dataset has the same number of replicates (five), the spectra of the ith sample occupy a contiguous block of the list. To get them, set the ith variable and run the following snippet:

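	ith		<- 1 # A value between 1 and length(sampleNames)
	replicates	<- 5 # Number of spectra per sample
	spectraIndex 	<- (ith-1)*replicates + 1 # Index of the first spectrum of the sample
	sample.name 	<- sampleNames[ith]
	sample.spectra 	<- spectra[spectraIndex:(spectraIndex+replicates-1)]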

The name and the spectra of the first sample are now stored in sample.name and sample.spectra:

	> sample.name
	[1] "HA"
	
	> sample.spectra
	[[1]]
	S4 class type            : MassSpectrum      
	Number of m/z values     : 155               
	Range of m/z values      : 656.165 - 3349.394
	Range of intensity values: 3e-03 - 1e+00     
	Memory usage             : 3.938 KiB         
	File                     : /tmp/dataset/HEALTHY/HA/spectrum1.csv
	
	[[2]]
	S4 class type            : MassSpectrum      
	Number of m/z values     : 144               
	Range of m/z values      : 656.152 - 3349.637
	Range of intensity values: 5e-03 - 1e+00     
	Memory usage             : 3.766 KiB         
	File                     : /tmp/dataset/HEALTHY/HA/spectrum2.csv
	
	[[3]]
	S4 class type            : MassSpectrum      
	Number of m/z values     : 116               
	Range of m/z values      : 656.173 - 3348.615
	Range of intensity values: 2e-03 - 1e+00     
	Memory usage             : 3.328 KiB         
	File                     : /tmp/dataset/HEALTHY/HA/spectrum3.csv
	
	[[4]]
	S4 class type            : MassSpectrum      
	Number of m/z values     : 139               
	Range of m/z values      : 656.162 - 3349.348
	Range of intensity values: 4e-03 - 1e+00     
	Memory usage             : 3.688 KiB         
	File                     : /tmp/dataset/HEALTHY/HA/spectrum4.csv
	
	[[5]]
	S4 class type            : MassSpectrum      
	Number of m/z values     : 118               
	Range of m/z values      : 656.177 - 3349.325
	Range of intensity values: 7e-03 - 1e+00     
	Memory usage             : 3.359 KiB         
	File                     : /tmp/dataset/HEALTHY/HA/spectrum5.csv
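
When samples do not all have the same number of replicates, index arithmetic no longer works, but the file path of each spectrum still encodes its condition and sample. The sketch below groups list positions by sample using only those paths. It assumes that the paths can be collected from the imported objects (with MALDIquant, presumably through metaData, as noted in the comment); a hand-written vector of illustrative paths stands in here so that the example runs without the packages.

```r
# File paths of the imported spectra.
# Assumption: with MALDIquant these could be collected with
#   files <- sapply(spectra, function(s) metaData(s)$file)
# The vector below is illustrative only.
files <- c(
  "/tmp/dataset/HEALTHY/HA/spectrum1.csv",
  "/tmp/dataset/HEALTHY/HA/spectrum2.csv",
  "/tmp/dataset/HEALTHY/HB/spectrum1.csv",
  "/tmp/dataset/LYMPHOMA/LA/spectrum1.csv"
)

# The parent directory is the sample, the grandparent directory is the condition
sampleOf <- basename(dirname(files))
conditionOf <- basename(dirname(dirname(files)))

# Positions in the spectra list, grouped by sample name
indexBySample <- split(seq_along(files), sampleOf)

# Positions of the spectra that belong to sample HA
haIndex <- indexBySample[["HA"]]
# With the real list: sample.spectra <- spectra[haIndex]
```

The same split call with conditionOf instead of sampleOf groups the spectra by condition.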