aroma

- An R Object-oriented Microarray Analysis environment

in 100% R¹

(formerly known as the com.braju.sma package)

2001-2006

Henrik Bengtsson
Division for Mathematical Statistics, Centre for Mathematical Sciences
Lund University, Sweden


Related publications for microarray analysis

Since the text following below is a bit out of date, we recommend that you also read the following papers available through http://www.maths.lth.se/bioinformatics/publications/.

A. Bengtsson and H. Bengtsson, Microarray image analysis: background estimation using quantile and morphological filters, BMC Bioinformatics, 2006, 7:96.

H. Bengtsson and O. Hössjer, Methodological study of affine transformations of gene expression data with proposed normalization method, BMC Bioinformatics, 2006, 7:100.

H. Bengtsson, G. Jönsson and J. Vallon-Christersson, Calibration and assessment of channel-specific biases in microarray data with extended dynamical range, BMC Bioinformatics, 2004, 5:177.

H. Bengtsson, Low-level Analysis of Microarray Data, Doctoral Thesis in Mathematical Sciences 2004:6, Mathematical Statistics, Lund University, 2004.

H. Bengtsson and O. Hössjer, Affine calibration for microarrays with dilution series or spike-ins, Preprints in Mathematical Sciences 2004:19, Mathematical Statistics, Lund University, 2004.

H. Bengtsson, aroma - An R Object-oriented Microarray Analysis environment, Preprints in Mathematical Sciences 2004:18, Mathematical Statistics, Lund University, 2004.

H. Bengtsson, Identification and normalization of plate effect in cDNA microarray data, Preprints in Mathematical Sciences, 2002:28, Mathematical Statistics, Lund University, 2002.

H. Bengtsson, B. Calder, I. S. Mian, M. Callow, E. Rubin, and T. P. Speed, Identifying Differentially Expressed Genes in cDNA Microarray Experiments, Science SAGE KE 2001 (12), vp8. [DOI: 10.1126/sageke.2001.12.vp8]

See the Bioinformatics Group for a more thorough list of publications and talks.


1. Introduction

The aroma package extends the ideas introduced by the sma package [1], written by the Speed Group at UC Berkeley. In the beginning the sma package also had to be installed, but now it is only needed for running some of the examples, because those examples use the data in the sma package. Note that the aroma package has been designed to be backward compatible with sma, i.e. the basic data structures of this package can also be used by the sma package without any transformation. Much credit goes to Ben Bolstad, Sandrine Dudoit, Ingrid Lönnstedt, Natalie Roberts and Jean Yee Hwa Yang (in alphabetical order) for writing the sma package. Other known packages related to sma and aroma are smawehi and genomics.sma.

I wrote this package for two reasons. First, as cDNA microarray technology is part of my PhD research topic, I needed a good programming foundation to work with, and I wanted it to be object oriented. Second, I found that the object-oriented approach also made microarray analysis much more user-friendly, and I decided to make the package easy to use also for non-programmers and non-statisticians. As a result of making the package truly object oriented, a core package named R.oo was created to provide generic ways of defining classes and to implement support for references in R, cf. [2][3].

1.1 Features

For download and installation instructions, see section 10, Downloading and installation, at the end.

2. Loading the package and running an example

To start using aroma, the package has to be loaded into the [R] environment. This is done by typing

> library(aroma)

at the [R] prompt. This will automatically load other required packages such as R.oo, R.io etc. When the packages are loaded they produce some messages and warnings, all of which are expected:

> library(aroma)
Loading required package: R.oo
R.oo v0.63 (2004-03-03) was successfully loaded.
Loading required package: R.io
R.io v0.50 (2004-03-10) was successfully loaded.
Loading required package: R.graphics
Loading required package: R.colors
R.colors v0.50 (2004-03-10) was successfully loaded.
R.graphics v0.50 (2004-03-10) was successfully loaded.
aroma v0.69 (2004-03-10) was successfully loaded.
>

Try one of the many examples found in the documentation (via help.start()) by typing

> example(MAData)

which will load the sma package (if it is not already loaded), use its example data, apply some normalization methods and display some plots.

3. Importing data from file

The package can import data from several different types of microarray image analysis software and from the different versions of those.

3.1 Reading data from file

The following file formats can be read by the package:

Source software   Typical extensions   Static method
GenePix           *.gpr                GenePixData$read()
ScanAlyze         *.dat                ScanAlyzeData$read()
ImaGene           *.txt                ImaGeneData$read()
QuantArray        *.txt                QuantArrayData$read()
Spot              *.spot, *.dat        SpotData$read()
Spotfinder        *.tav                SpotfinderData$read()

Instead of calling the read() method of the specific class, one can use the static method MicroarrayData$read(), which will try each of the above class methods automatically (MicroarrayData is a superclass of all the classes listed above). This means that one does not have to worry about the file format and can simply call MicroarrayData$read() regardless of it. However, all files imported in one call have to be of the same file format. Calling the correct class method explicitly will always be faster.
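
The try-each-reader strategy can be sketched in base [R]. This is only an illustration and not aroma's implementation: the readers read_format_a() and read_format_b(), the "FORMAT-A"/"FORMAT-B" file signatures and the function read_any() are hypothetical names invented for the example.

```r
# Hypothetical stand-ins for format-specific readers; each one stops with an
# error if the file does not look like its format.
read_format_a <- function(file) {
  first <- readLines(file, n = 1)
  if (!startsWith(first, "FORMAT-A")) stop("not a format A file")
  list(format = "A", header = first)
}
read_format_b <- function(file) {
  first <- readLines(file, n = 1)
  if (!startsWith(first, "FORMAT-B")) stop("not a format B file")
  list(format = "B", header = first)
}

# Try each reader in turn and return the first successful result.
read_any <- function(file, readers) {
  for (reader in readers) {
    result <- tryCatch(reader(file), error = function(e) NULL)
    if (!is.null(result)) return(result)
  }
  stop("no reader recognized the file: ", file)
}

readers <- list(read_format_a, read_format_b)
tmp <- tempfile(fileext = ".txt")
writeLines(c("FORMAT-B v1", "1\t2\t3"), tmp)
res <- read_any(tmp, readers)
res$format
```

Every reader that fails costs one attempt, which is why calling the correct reader directly is faster than the generic entry point.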

3.2 Reading one slide

It is simple to read one slide from a file in any of the above data formats. Here is an example of reading a GenePix file; reading data in the other formats is analogous:

gpr <- MicroarrayData$read("gpr123.gpr")

The result of this code is that the data is now stored in an object called gpr. If the data is not located in the current directory, one can specify the data directory with the argument path. Here is how one loads data from the data/ directory installed with the aroma package (and prestored in the aroma data structure):

gpr <- MicroarrayData$read("gpr123.gpr", path=aroma$dataPath)

Now, try to type gpr at the prompt (and press ENTER):

> gpr
[1] "GenePixData: Number of fields: 44. Layout: Layout: Grids: 4x4 (=16), spots
in grids: 18x18 (=324), total number of spots: 5184. Spot names and id's are 
specified."

This tells you that the gpr object is of class GenePixData, that it contains 44 fields, and that the microarray has a layout of 16 grids with 324 spots each, 5184 spots in total.

Note: Microarrays analyzed by ImaGene are saved in two separate files, which means that you have to read in two ImaGeneData objects, e.g.

igG <- MicroarrayData$read("imagene234.cy3", path=aroma$dataPath)
igR <- MicroarrayData$read("imagene234.cy5", path=aroma$dataPath)

3.3 Reading several slides

It is possible to read several files at once given that they are all of the same format, e.g. all files are GenePix Results files. At the time of writing it is not possible to read and combine files of different formats. For instance, to read two ScanAlyze files one writes:

filenames <- c("group4.dat", "group7.dat")
sa <- MicroarrayData$read(filenames, path=aroma$dataPath)

Looking at sa

> sa
[1] "ScanAlyzeData: Number of slides: 2. Number of fields: 33. Layout: 
Grids: 2x2 (=4), spots in grids: 7x7 (=49), total number of spots: 196."

we see that it is an object of class ScanAlyzeData, that it contains 33 fields, and that each of the two slides has a layout of 4 grids with 49 spots each, 196 spots in total.

It is also possible to read all files whose names match a certain regular expression pattern. To do this one has to specify the pattern argument, as in this example:

sa <- MicroarrayData$read(pattern="group.*.dat", path=aroma$dataPath)

which reads the same two files as before.
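
How aroma applies the pattern internally is not described here. The sketch below assumes that it behaves like an ordinary regular expression in base [R] (as in grepl() or list.files(pattern=...)), and uses hypothetical file names to show what "group.*.dat" does and does not match:

```r
# Hypothetical file names in a data directory.
files <- c("group4.dat", "group7.dat", "gpr123.gpr", "groups.txt",
           "mygroup1.dat.bak")

# The pattern as written: "." matches any character and the match may occur
# anywhere in the file name.
loose <- grepl("group.*.dat", files)
files[loose]

# An anchored pattern with the dot escaped only matches names that start
# with "group" and end in ".dat".
strict <- grepl("^group.*\\.dat$", files)
files[strict]
```

Under this assumption the unanchored pattern also picks up "mygroup1.dat.bak", so a stricter pattern may be preferable in directories with many files.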

Note: Currently it is not possible to read several ImaGene data files at once. The reason for this is that I have not decided on what the arguments should be called. Remember that for each slide ImaGene has two separate files, whereas all other supported software has one file.

4. Excluding bad spots

4.1 A first word

Instead of excluding "bad" spots from the beginning, it might be wiser to keep all spots and instead exclude them from the results at the end. A "bad" spot is after all not totally useless for some of the methods/algorithms.

4.2 Excluding bad spots

To prevent a spot from being used in the analysis, its value(s) should be set to NA. Spots with NA values are not considered by the methods in the package. Here is an example where spots in an MAData object are excluded:

badspots <- c(1,34,35, 105:120)
slides <- c(1,3)
ma$M[badspots, slides] <- NA

To exclude the spots specified by badspots in all slides, just omit the slide index:

badspots <- c(1,34,35, 105:120)
ma$M[badspots,] <- NA

As shown here, it is normally enough to exclude one of the fields, e.g. M.

4.3 Excluding spots flagged as bad by image analysis software

Some image analysis packages, such as GenePix, flag spots by certain criteria or let the user flag spots as bad. To exclude such spots one first needs to identify which spots on which slides are bad by looking at the Flags field in the GPR data. Consider the data (gpr and raw) from the two slides loaded by example(GenePixData.read):

bad <- which(gpr$Flags < 0, arr.ind=TRUE)

Here bad is a matrix in which each row contains a spot index and a slide index:

> bad
         row col
[1,]       8   1
[2,]      41   1
...
[1478,] 5184   2

To exclude these spots from the RawData object raw, index with the two-column matrix itself, so that each row is treated as one (spot, slide) pair:

raw$R[bad] <- NA

Advanced (and easier) way: Making use of the fact that [R] can treat matrices as vectors and that all MicroarrayData objects in aroma store their data in matrices of the same shape, one can do the above as

bad <- (gpr$Flags < 0)
raw$R[bad] <- NA

As shown above, it is normally enough to exclude the values in one of the two foreground channels, since most downstream algorithms require both channels in one way or another. For example, extracting the signal

ma <- getSignal(raw)

one sees that the bad spots get M and A values that are NA:

> ma$M[bad]
NA NA NA ... NA
> ma$A[bad]
NA NA NA ... NA
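
The indexing idioms of this chapter can be tried out in base [R] without aroma. The sketch below uses small plain matrices as hypothetical stand-ins for gpr$Flags, raw$R and raw$G; it compares indexing by a two-column index matrix, by a logical mask, and by separate row and column vectors, and then shows how an NA in one channel carries over to M and A.

```r
# Toy stand-ins for gpr$Flags and raw$R / raw$G:
# 5 spots (rows) x 2 slides (columns).
flags <- matrix(0, nrow = 5, ncol = 2)
flags[2, 1] <- -50    # spot 2 flagged on slide 1
flags[4, 2] <- -100   # spot 4 flagged on slide 2
R <- matrix(100 * (1:10), nrow = 5, ncol = 2)
G <- matrix(50, nrow = 5, ncol = 2)

# (a) Two-column index matrix: one (spot, slide) pair per row.
idx <- which(flags < 0, arr.ind = TRUE)
R_pairs <- R
R_pairs[idx] <- NA

# (b) Logical mask of the same shape as the data.
R_mask <- R
R_mask[flags < 0] <- NA

# (c) Separate row and column vectors select the cross product of the rows
# and columns, which here also blanks the unflagged spots (2, 2) and (4, 1).
R_cross <- R
R_cross[idx[, 1], idx[, 2]] <- NA

sum(is.na(R_pairs))   # 2
sum(is.na(R_mask))    # 2
sum(is.na(R_cross))   # 4

# An NA in one channel propagates to both M and A.
M <- log2(R_mask / G)
A <- 0.5 * log2(R_mask * G)
```

Forms (a) and (b) touch exactly the flagged spots, whereas form (c) is the right tool only when the same spots are to be removed on every listed slide, as in section 4.2.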

5. Extracting and transforming the signal

5.1 Extract foreground and background signals

After having loaded data from file one has to extract the raw data. By raw data we mean the red and the green foreground (R, G) and the red and the green background (Rb, Gb) signals. The method as.RawData() extracts the raw data. For instance, given a GenePixData object:

gpr <- GenePixData$read(pattern="*.gpr", path=aroma$dataPath)

we can extract the raw signal as

raw <- as.RawData(gpr)

If we type raw at the prompt, we will get

> raw
[1] "RawData: R (5184x2), G (5184x2), Rb (5184x2), Gb (5184x2), Layout: 
Grids: 4x4 (=16), spots in grids: 18x18 (=324), total number of spots: 
5184. Spot names and id's are specified."

which tells us that the RawData object contains R, G, Rb, and Gb signals for 5184 spots from 2 slides. The layout is of course the same as it was for the GenePixData object. The fields R, G, Rb and Gb, which are matrices, can be accessed via the $ operator (as for lists):

raw$R   # Red channel foreground signal
raw$G   # Green channel foreground signal
raw$Rb  # Red channel background signal
raw$Gb  # Green channel background signal

In the ImaGene case you have to read in two files, one for each channel. Since each microarray slide is then represented by two ImaGeneData objects, as.RawData() takes two objects in that case, e.g.

igG <- ImaGeneData$read("imagene234.cy3", path=aroma$dataPath)
igR <- ImaGeneData$read("imagene234.cy5", path=aroma$dataPath)
raw <- as.RawData(igR, igG)

5.2 Doing background subtraction or not and transforming

The first decision to make after loading the data is whether or not background subtraction should be performed. How to decide on this is discussed in chapter 6; here we merely show how it is done.

Extracting the actual spot signal is done either by subtracting the background from the foreground signal in each of the two channels separately, or simply by discarding the background and treating the foreground as the true signal. The method getSignal() provides an easy way of doing this and, for convenience, it also transforms the (R,G) signal into an (M,A) signal:

ma <- getSignal(raw, bg.subtract=TRUE)

giving

> ma
[1] "MAData: M (5184x2), A (5184x2), Layout: Grids: 4x4 (=16), spots
in grids: 18x18 (=324), total number of spots: 5184. Spot names and 
id's are specified."

If you do not want to do background subtraction just write bg.subtract=FALSE instead. As with RawData objects one can also get the M and the A signals (matrices) by ma$M and ma$A.

5.2.1 More on the (R,G) to (M,A) transformation

As pointed out, getSignal() also transforms the background corrected (R-Rb,G-Gb) or non-background corrected (R,G) signals to (M,A) signals. In short, the M value of a spot is the logarithm (base 2) of its red to green ratio, and the A value is the logarithm (base 2) of the same spot's intensity. In detail, the transform is:


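\[
M = \log_2(R/G), \quad A = \tfrac{1}{2}\log_2(R \cdot G)
\quad\Longleftrightarrow\quad
R = \left(2^{2A+M}\right)^{1/2}, \quad G = \left(2^{2A-M}\right)^{1/2}.
\tag{1}
\]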

The reason for using logarithms with base 2 is that the source data is binary: for most data sources the signals come from 16-bit images, which brings the log-intensity into the range 0 <= A <= 16. Further, a log-ratio of M=+1 (M=-1) means that the red (green) signal is twice as large as the green (red) signal.
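
Equation (1) can be checked numerically in base [R]. The functions rg_to_ma() and ma_to_rg() below are hypothetical helpers written for this illustration, not aroma functions, and the signal values are made up:

```r
# Forward transform (R, G) -> (M, A).
rg_to_ma <- function(R, G) {
  list(M = log2(R / G), A = 0.5 * log2(R * G))
}

# Inverse transform (M, A) -> (R, G), following equation (1).
ma_to_rg <- function(M, A) {
  list(R = sqrt(2^(2 * A + M)), G = sqrt(2^(2 * A - M)))
}

# Hypothetical signals: red twice the green, green twice the red, and both
# channels at the 16-bit maximum.
R <- c(2000, 500, 65535)
G <- c(1000, 1000, 65535)

ma <- rg_to_ma(R, G)
ma$M                  # 1, -1, 0

# The round trip recovers the original signals (up to rounding error).
rg <- ma_to_rg(ma$M, ma$A)
all.equal(rg$R, R)
all.equal(rg$G, G)

# Even a saturated 16-bit spot stays below A = 16.
max(ma$A) < 16
```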

The (R,G) to (M,A) transform is invertible and the corrected (R,G) signals can be retrieved from an MAData object by as.RGData(ma), but this is rarely wanted or needed.

rg <- as.RGData(ma)

giving

> rg
[1] "R (5184x2), G (5184x2), Layout: Grids: 4x4 (=16), spots in
grids: 18x18 (=324), total number of spots: 5184. Spot names and
id's are specified."

6. Subtracting background

6.1 Introduction

It is commonly believed that background subtraction should always be done, but this is not always the best thing to do, as this chapter shows. Raw data for one cDNA microarray slide typically consists of two scanned 16-bit images. During the scanning a green and a red laser excite the Cy3 and Cy5 dyes and the emission is registered. From these two images at least four different measures for each spot are extracted. The foreground signals, R and G, are often measured as the mean or the median of the intensities of the pixels that are identified as belonging to the spot. There are several ways to define what the background of a spot is, and different definitions often result in different background estimates, Rb and Gb. All background identification methods strive to get a representative estimate of the background noise in the area of the spot, a noise that is also expected to be added to the foreground signals. With good estimates of the foreground and the background signals it is reasonable to believe that the background subtracted signals, R'=R-Rb and G'=G-Gb, reflect the gene expression levels.

6.1.1 Comparison between different image analysis software packages

It is important to remember that the spots are not always easily identifiable regions in the images, but can often be weak or smeared out like comets. It also happens that two or more spots are so large that they overlap, or that the microarray slides are contaminated with dust and scratches. For reasons like these it is important to have good methods for estimating the foreground and the background signals. Yang et al [4] compare the different foreground and background identification and estimation methods that are used by the most common image analysis software, such as GenePix, QuantArray, ScanAlyze and Spot. GenePix uses so-called adaptive circles; QuantArray offers different methods such as fixed circles, adaptive circles (aka Chen's method) and histogram techniques. ScanAlyze uses only fixed circles, whereas Spot uses different seeded region growing algorithms to estimate the foreground and background signals. Based on the two replicated experiments that the authors present, one can make a simple ranking of the image analysis methods, starting with the best: Spot, QuantArray (histogram), GenePix, ScanAlyze. The QuantArray methods that use fixed or adaptive circles performed poorly and gave much higher noise in the replicated data. When no background correction is done, all software packages do almost equally well, excluding QuantArray's adaptive circles.

6.2 Deciding on background subtraction or not

In figure 1 below, background subtracted and non-background subtracted GenePix data is plotted. The plots show that doing background subtraction can introduce noise in the lower intensity regions. The artifact to the left, which looks like a peacock feather, is due to discrete noise. There is simply no possible combination of foreground and background signals that produces a value in between the "peacock stripes". In that range of intensities the background signals are subtracted from really small foreground signals.

[Images: two M vs A plots of GenePix data, background corrected (left) and non-background corrected (right)]
Figure 1. Left: Background corrected data. Right: Non-background corrected data. Since this data was extracted by GenePix and the intensity-dependent effect is not too large even without background subtraction, it is suggested not to do background correction.
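
Why only certain log-ratios can occur at low intensities can be illustrated in base [R]. The sketch assumes integer-valued background subtracted signals (as one would get from, for example, medians of integer pixel intensities) and enumerates every combination of very small values; the range 1 to 4 is chosen only for illustration.

```r
# All combinations of small integer background-subtracted signals.
signals <- expand.grid(R = 1:4, G = 1:4)
signals$M <- log2(signals$R / signals$G)
signals$A <- 0.5 * log2(signals$R * signals$G)

# Only a handful of distinct log-ratios are possible at these intensities.
possible_M <- sort(unique(signals$M))
length(possible_M)    # 11 distinct values from 16 combinations

# The smallest non-zero |M| is log2(4/3); nothing closer to zero can occur.
min(abs(possible_M[possible_M != 0]))
```

Plotting signals$M against signals$A gives the same kind of striped pattern, since the gaps between the possible M values are large when the signals are small.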

In a second data set, also from GenePix, we get the same peacock tails when we correct for the background. However, if background correction is not done there are a lot of intensity-dependent effects; see figure 2. For these reasons it is suggested to do background correction. The choice is not obvious, but at least it is a decision.

[Images: two M vs A plots of GenePix data, background corrected (left) and non-background corrected (right)]
Figure 2. Left: Background corrected data. Right: Non-background corrected data. This data (a different data set) was also extracted by GenePix. Here the intensity-dependent effect is larger and it is suggested to do background correction.

As a last example we show data (a third data set) extracted by the Spot software, cf. figure 3. In this case no peacock tails are seen when background correction is done.

[Images: two M vs A plots of Spot data, background corrected (left) and non-background corrected (right)]
Figure 3. Left: Background corrected data. Right: Non-background corrected data. No peacock tails are seen for the background corrected data. The image analysis package used was Spot.

6.2.1 Rules of thumb

Unfortunately, it is not possible to give a generic formula for deciding whether background subtraction should be done or not. Below are some suggestions to help decide. These should not be considered rules, but suggestions. Sometimes one simply has to try both approaches and see what results one gets in the end, because there is probably some prior biological belief about which genes to find.

Software     Segmentation method   Background correction?   Comments
Spot         morph                 probably
Spot         morph.close.open      yes                      This is the best Spot method.
Spot         valley                probably not
GenePix      adaptive circles      probably                 If there is a significant peacock effect, avoid background subtraction.
ScanAlyze    fixed circles         probably                 If there is a significant peacock effect, avoid background subtraction.
QuantArray   histogram             yes                      This is the best QuantArray method.
QuantArray   fixed circles         probably not
QuantArray   adaptive circles      probably not

6.3 How to do it

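As seen in section 5.2, background subtraction is done (or skipped) when the (M,A) signal is extracted from the RawData object. Use one of the following two calls:

ma <- getSignal(raw, bg.subtract=TRUE)    # with background subtraction
ma <- getSignal(raw, bg.subtract=FALSE)   # without background subtraction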

7. Normalizing data

Normalization of cDNA microarray data is currently done 1) slide by slide and 2) across slides. The slide-by-slide normalization should be done before the across-slides normalization.

7.1 Normalizing within slide

Slide-by-slide normalization is also referred to as within-slide normalization. There are several different within-slide normalization methods to choose from, and which one to choose depends on the data and on what you believe could be the reason for the artifacts. All within-slide normalization methods assume that most genes are non-differentially expressed. So what is meant by "most"? For fitting the curve, the LOWESS method is used. LOWESS is a robust method that fits local linear regressions such that up to 50% outliers (differentially expressed genes etc.) on both sides can exist without affecting the fit too much. This can be compared with how the median tolerates 50% outliers whereas the mean cannot tolerate a single one. However, note that the "up-to-50%-outliers" rule is not global; if you have 50% outliers, they should be spread over the whole intensity range. The result of the curve fit is also best if the numbers of outliers above and below the curve are approximately equal. To conclude, in practice you may not be able to have 50% differentially expressed genes (and other outliers), but probably 10-30%. A rule of thumb is to plot the data and the curve and see if it looks all right.

[Note that the above paragraph has been updated. Before it said "By most, it is often meant about 95% of the genes/spots...", but that is a quite strong requirement. The new one is more reasonable. Thanks Gordon Smyth for commenting on this. /HB 2004-01-13]

If this is not true, it could be very hard to motivate normalization based on any of the following within-slide normalization methods. Currently, there are no within-slide methods that do not assume the above. However, in some situations it is actually possible to (re)design the experiments in such a way that the above holds. For ideas see chapter Designing Time-series Experiments (still to be written).

The available within-slide normalization methods in aroma are listed in the table below:

Method                 Argument
Scaled print-tip       method="s"
Print-tip              method="p"
Whole-slide (lowess)   method="l"
Median                 method="m"

In the following sections these different normalization methods are explained. Note that only one of the methods is needed. The scaled print-tip normalization method also performs a print-tip normalization, and the print-tip normalization also corrects for the overall intensity artifact of the slide. Also, any of these normalization methods will ensure that the median of the log-ratios after the normalization is zero.

7.1.1 Whole-slide (lowess) normalization

normalizeWithinSlide(ma, method="l")
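
The idea behind intensity-dependent normalization can be sketched in base [R] with stats::lowess(). This is not aroma's implementation of normalizeWithinSlide(); the data are simulated and the smoother span f = 0.3 is an arbitrary choice for the illustration.

```r
set.seed(1)

# Simulated (hypothetical) slide: log-ratios with an intensity-dependent bias.
n <- 500
A <- runif(n, min = 6, max = 14)
M <- 0.3 * (A - 10) + rnorm(n, sd = 0.2)

# Fit a robust local regression curve of M on A. lowess() returns the fit
# sorted by increasing x, so it is mapped back through order(A).
fit <- lowess(A, M, f = 0.3, iter = 3)
trend <- numeric(n)
trend[order(A)] <- fit$y

# Subtract the fitted curve to remove the intensity-dependent trend.
M_norm <- M - trend

c(before = sd(M), after = sd(M_norm))
```

After the subtraction the log-ratios are centred around zero over the whole intensity range, which is what the M vs A plot of normalized data should show.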

7.1.2 Print-tip normalization

normalizeWithinSlide(ma, method="p")

7.1.3 Scaled print-tip normalization

normalizeWithinSlide(ma, method="s")

7.1.4 Median normalization

In addition to the above normalization methods there is also the (too) commonly used median normalization method. This simply estimates the median ("the average not affected by outliers") of the log-ratios and subtracts it, without taking intensity-dependent artifacts into account. If you still would like to use this method, just do:

normalizeWithinSlide(ma, method="m")
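
In base [R] terms, median normalization amounts to the following sketch. It operates on a plain matrix of made-up log-ratios (spots in rows, slides in columns) rather than on an MAData object, and is only meant to illustrate the operation.

```r
# Hypothetical log-ratios for 6 spots on 2 slides; one spot is excluded (NA).
M <- cbind(c(0.9, 1.1, NA, 1.4, 0.7, 1.0),
           c(-0.2, 0.1, 0.3, -0.5, 0.0, 0.4))

# Subtract each slide's median, ignoring NA values.
slide_medians <- apply(M, 2, median, na.rm = TRUE)
M_norm <- sweep(M, 2, slide_medians)

# The medians of the normalized log-ratios are now (numerically) zero.
apply(M_norm, 2, median, na.rm = TRUE)
```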

7.2 Normalizing across slides

The across-slides normalization method scales the log-ratios for each slide so that all slides get the same spread as measured by the median absolute deviation (MAD), which is a robust measure of spread. Applying across-slides normalization is more straightforward, since in most cases there is no need to make any decisions:

normalizeAcrossSlides(ma)

[Figure: M vs A plot]
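
The scaling step can be sketched in base [R] with stats::mad(). The common scale that normalizeAcrossSlides() aims for is not stated here, so the sketch assumes the geometric mean of the per-slide MADs as the target; the data are simulated.

```r
set.seed(2)

# Simulated (hypothetical) log-ratios for three slides with different spread.
M <- cbind(rnorm(200, sd = 0.5), rnorm(200, sd = 1.0), rnorm(200, sd = 2.0))

# Robust spread of each slide.
slide_mad <- apply(M, 2, mad, na.rm = TRUE)

# Assumption: rescale every slide to the geometric mean of the MADs.
target <- exp(mean(log(slide_mad)))
M_scaled <- sweep(M, 2, target / slide_mad, FUN = "*")

# All slides now have the same MAD.
apply(M_scaled, 2, mad, na.rm = TRUE)
```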

8. Exporting data to file

8.1 Writing data to file

Data can be written to file in the following formats:

Format                       Typical extensions   Multiple slides   Method
Tab-delimited (Excel etc.)   *.dat                yes               write(<MicroarrayData object>)
GenePix                      *.gpr                no                write(<GenePixData object>)
ImaGene                      *.txt                no                write(<ImaGeneData object>)
QuantArray                   *.txt                no                write(<QuantArrayData object>)
ScanAlyze                    *.dat                no                write(<ScanAlyzeData object>)
Spot                         *.spot, *.dat        no                write(<SpotData object>)
Spotfinder                   *.tav                no                write(<SpotfinderData object>)

Note that only a GenePixData object can be written to a GenePix Results file, only a SpotData object can be written to a Spot file, etc. It is not possible to write, for instance, a SpotData object to a GenePix Results file, and so on. Nor is it possible to write an MAData object directly to a GenePix Results file.

Also note that due to the file format specifications of the GenePix Results, the ImaGene, the QuantArray, the ScanAlyze and the Spot file formats, it is not possible to write more than one slide to each file.

8.2 Writing one slide to file

Say that we have a ScanAlyzeData object containing two slides:

sa <- ScanAlyzeData$read(pattern="group.*.dat", path=aroma$dataPath)

giving

> sa
[1] "ScanAlyzeData: Number of slides: 2. Number of fields: 33. Layout: 
Grids: 2x2 (=4), spots in grids: 7x7 (=49), total number of spots: 196."

To save the second slide in a new file named slide2.dat in the current directory, one uses the method write() as:

write(sa, "slide2.dat", slide=2)

and the file will be in ScanAlyze file format. To force it to be in a tab-delimited file format, see section 8.4.

8.3 Write slides to a tab-delimited file

For a MicroarrayData object of a class that does not implement a special write() method (unlike, e.g., GenePixData, ScanAlyzeData and SpotData), the default is to write it as a tab-delimited file. Here is an example that reads some microarray slides, normalizes them and writes the normalized data back to file:

raw <- as.RawData(sa)
ma <- getSignal(raw)
normalizeWithinSlide(ma, method="l")
write(ma, "normalized.dat")

The file written will contain data from both slides. Here is an excerpt of the file:

"slide"	"spot"	"M"	"A"
1	1	1.02470822498539	9.82813561118849
1	2	0.84651744359333	9.71060507370453
1	3	NA	NA
...
1	196	0.178051472468407	6.31056805663732
2	1	-0.700861412028143	14.2345609967426
2	2	-0.0412738551502208	13.8118980240359
...
2	195	0.369976778475562	12.1762311081584
2	196	0.0776465730445551	12.8780668743885
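
A file with the same layout can be produced and read back with base [R] alone. The sketch below is not aroma's write() method; it uses made-up M and A matrices and only illustrates how the slides are stacked into one table with a slide and a spot column.

```r
# Hypothetical M and A matrices: 3 spots (rows) x 2 slides (columns).
M <- matrix(c(1.02, 0.85, NA, -0.70, -0.04, 0.37), nrow = 3)
A <- matrix(c(9.83, 9.71, NA, 14.23, 13.81, 12.18), nrow = 3)

# Stack the slides on top of each other: one row per (slide, spot) pair.
long <- data.frame(
  slide = rep(seq_len(ncol(M)), each = nrow(M)),
  spot  = rep(seq_len(nrow(M)), times = ncol(M)),
  M     = as.vector(M),
  A     = as.vector(A)
)

# Write with a quoted header and tab separators, then read it back.
path <- tempfile(fileext = ".dat")
write.table(long, path, sep = "\t", row.names = FALSE, quote = TRUE)
back <- read.table(path, header = TRUE, sep = "\t")

readLines(path, n = 2)
```

Excluded spots are written as NA and come back as NA when the file is read again.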

8.4 To force file format to be tab-delimited

As mentioned before, objects of class GenePixData, ScanAlyzeData and SpotData are written in their own file formats. To force such an object to be written as tab-delimited data, one calls the write method of the superclass instead. For instance, ScanAlyzeData extends the class MicroarrayData (which extends class Object), and the write() method implemented for MicroarrayData objects writes, as described in the previous section, tab-delimited files:

write.MicroarrayData(sa, "sadata.dat")

which actually writes all slides to the same file. Use argument slides to write only some of the slides.

9. Other documentation

The manual pages, which come with the installation, can also be found online at:

10. Downloading and installation

Please see http://www.braju.com/R/ for instructions.

11. Licence

It is still to be decided if this package should be published under either the GPL or the LGPL license. It will be the one that will be most useful for everyone.

12. Citing this document

Whenever using this package please cite this document according to:

@TECHREPORT{BengtssonH_2004,
  author = {Bengtsson, Henrik},
  title = {{aroma} - {A}n {R} {O}bject-oriented {M}icroarray {A}nalysis environment},
  institution = {Mathematical Statistics, Centre for Mathematical Sciences, Lund University, Sweden},
  year = {2004},
  type = {{Preprint in Mathematical Sciences}},
  number = {2004:18},
}

References

[1] Y. H. Yang et al, Statistics for Microarray Analysis (sma), Statistics Department, University of California Berkeley, 2002.
http://www.stat.berkeley.edu/users/terry/zarray/Software/smacode.html
[2] Henrik Bengtsson, The R.oo package - Object-Oriented Programming with References Using Standard R Code. In Kurt Hornik, Friedrich Leisch and Achim Zeileis, editors, Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003), March 20-22, Vienna, Austria.
http://www.ci.tuwien.ac.at/Conferences/DSC-2003/
[3] Henrik Bengtsson, Programming with References - a case study using the R.oo package, Division for Mathematical Statistics, Centre for Mathematical Sciences, Lund University, Sweden, 2002.
http://www.maths.lth.se/help/R/
[4] Y. H. Yang, M. J. Buckley, S. Dudoit and T. P. Speed. Comparison of methods for image analysis on cDNA microarray data. Technical report #584, University of California, Berkeley. November 2000.
http://www.stat.berkeley.edu/users/terry/zarray/Html/image.html

About this page

This page was dynamically generated. In other words, you can trust that the version number shown, the list of contained packages of the bundle and the files linked to, reflect the latest version of this package.




¹ By 100% [R] I mean that all the source code is written in [R] and that no external native code (such as C or Fortran code) has been used. This means that installation of the package is straightforward since no compilation is required.