[{ALLOW edit EISMainUsers}]
[{ALLOW view Anonymous}]
The datasets for the EIS Data Compression (DC) factor investigation were chosen as follows:

1. Search the [EIS study database|../../SolarB/EisStudyList.jsp] to find which studies have used a specific compression scheme such as DPCM, JPEG95, etc. For example, studies 29, 30, 31, 32, 91, 92, 142, 255, and 258 have all used the JPEG90 compression scheme.

''notes:''

* Studies 29-32 were designed specifically to test all of the compression schemes; however, only studies 29 and 30 have generated useful FITS files. In addition, the compression scheme was selected in the eis_mk_study tool rather than in eis_mk_raster, so there is no scheme information in these FITS headers. The scheme therefore has to be identified manually for every FITS file produced by studies 29 and 30.

* Some studies use multiple rasters (e.g. study 142). To simplify the calculation of MDP data packets, the investigation focuses only on single-raster studies.\\

* See the text file [study_summary.txt|../images/dcgifs/study_summary.txt] for more information about the studies and their compression scheme selections.

\\

2. Once the details (e.g. ID) of the rasters that use a specific compression scheme are known, one can extract the original data volume for these rasters (from the EIS planning database) and the corresponding FITS files (from the EIS science database), and then assemble the dataset for the investigation. The file [datasets_list.txt|../images/dcgifs/datasets_list.txt] lists the datasets used in this case.

''notes:''

* The datasets chosen here were taken outside the EIS eclipse season.

* In the plots referred to below, data points further to the right on the X-axis come from more recent observations.


\\
\\
----

The method for calculating the EIS on-board compressed data volume from the MDP status curves was described in a previous post about the EIS DC factor: [check it here|CompressionFactorStudy].

The following links show how the DC factor varies in different scenarios (e.g. AR and QS for SCI_OBJ; 1", 2", 40", and 266" for SLA).

A conservative estimate of the average factor is indicated by the orange dashed line on each plot; it could be used to improve the EIS planning tools.

\\

* [DPCM Scheme]

* [JPEG98 Scheme]

* [JPEG95 Scheme]

* [JPEG92 Scheme]

* [JPEG90 Scheme]

* [JPEG85 Scheme]

* [JPEG75 Scheme]

* [JPEG65 Scheme]

* [JPEG50 Scheme]

\\

''note:''

The EIS Data Compression (DC) factors listed here cover only the __Science Target__ and __Slit/Slot__ selections. For the variation of the factor with __exposure time__, please refer to H. Hara's [results |../images/dcgifs/Compress_20080428_Hara.PNG] ([PDF|../images/dcgifs/Compress_20080428_Hara.pdf]).

\\

\\

----

For easier analysis, the plots in the links above can be summarised in the following table:

\\



%%sortable
|| Scheme || Total || QS || AR || CH || 1" || 2" || 40" || 266"
|DPCM |2.12|2.19 (2.38)|2.19|2.19|2.54|2.46|2.19|2.18
|JPEG98|2.60|2.46|2.22|3.15|2.85 (avg)|2.31|2.28 (2.32)| 
|JPEG95|3.02|3.64|3.26| |6.12|6.38 (6.54)|3.26| 
|JPEG92|3.45|3.31 (avg)|6.65 (avg)|2.74 (avg)|6.53 (avg)|2.74 (avg)|3.63 (avg)| 
|JPEG90|4.01|3.82|4.34|3.73|4.67 (avg)| |3.84|3.78
|JPEG85|4.79|4.82|4.51|4.91|3.9 (avg)| |4.51| 
|JPEG75|5.89| |6.2| |18.88 (avg)| |6.2|7.52
|JPEG65|7.38| |19.94 (avg)| |25.97 (avg)| |7.33 (avg)| 
|JPEG50|9.39|20.98|17.72| |35.37 (avg)| |17.72|
%%

\\

Using the DC factors (i.e. 'total' in the table above) for trend fitting, and comparing them with the numbers previously used in the EIS planning tool, we have:

''total EIS DC factor:''

%%table
||DPCM||JPEG98||JPEG95||JPEG92||JPEG90||JPEG85||JPEG75||JPEG65||JPEG50
|2.19|   2.33 |  3.26 |  3.63 |  3.82 |  4.51 |  6.20 |  7.33 |  9.36
%%

''previous EIS DC factor:''

%%table
||DPCM||JPEG98||JPEG95||JPEG92||JPEG90||JPEG85||JPEG75||JPEG65||JPEG50
|2.36  | 2.70  | 3.47 |  4.22 |  4.63 |  5.74 |  7.63 |  9.43 |  12.00
%%
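The comparison between the two tables above can be expressed as a percentage change per scheme. The following is a minimal Python sketch, not part of the original analysis; the variable and function names are illustrative, and the numbers are copied from the "total" and "previous" tables.

```python
# Values copied from the "total" and "previous" EIS DC factor tables above.
schemes = ["DPCM", "JPEG98", "JPEG95", "JPEG92", "JPEG90",
           "JPEG85", "JPEG75", "JPEG65", "JPEG50"]
total_dc = [2.19, 2.33, 3.26, 3.63, 3.82, 4.51, 6.20, 7.33, 9.36]
previous_dc = [2.36, 2.70, 3.47, 4.22, 4.63, 5.74, 7.63, 9.43, 12.00]


def percent_change(new, old):
    """Relative change of the new factor with respect to the old one, in percent."""
    return 100.0 * (new - old) / old


# Map each scheme name to its percentage change.
changes = {s: percent_change(n, o)
           for s, n, o in zip(schemes, total_dc, previous_dc)}

for scheme, change in changes.items():
    print(f"{scheme:7s} {change:+6.1f} %")
```

For every scheme the new factor is lower than the previous one, so planning with the previous numbers would underestimate the compressed data volume.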

For the full history of the EIS compression factor, see [this page|fullHistoryOfDcFactor].

\\

[{Image src='images/dcgifs/new_eisDC.gif'}]


The orange dashed line in the figure can be described by the equation: Y = 2.11768 + 0.556287*X - 0.0855122*X%%sup 2%% + 0.0161953*X%%sup 3%%

Thus, the corresponding numbers on the line are:

%%table
||DPCM||JPEG98||JPEG95||JPEG92||JPEG90||JPEG85||JPEG75||JPEG65||JPEG50
|2.12  | 2.60  | 3.02 |  3.45 |  4.01 |  4.79 |  5.89 |  7.38 | 9.39
%%
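The cubic quoted above can be evaluated directly to check the numbers in this table. The sketch below is in Python and assumes that X is the scheme index, running from 0 for DPCM to 8 for JPEG50; the page does not state this mapping, but it is the one under which the equation reproduces the table. The function and variable names are illustrative.

```python
# Coefficients of the fitted cubic, copied from the equation above.
COEFFS = (2.11768, 0.556287, -0.0855122, 0.0161953)

# Assumption: X is the position of the scheme in this list (0..8).
SCHEMES = ["DPCM", "JPEG98", "JPEG95", "JPEG92", "JPEG90",
           "JPEG85", "JPEG75", "JPEG65", "JPEG50"]

# Values from the table above, used only for comparison.
TABLE = [2.12, 2.60, 3.02, 3.45, 4.01, 4.79, 5.89, 7.38, 9.39]


def fitted_factor(x):
    """Evaluate Y = c0 + c1*X + c2*X^2 + c3*X^3 using Horner's scheme."""
    c0, c1, c2, c3 = COEFFS
    return c0 + x * (c1 + x * (c2 + x * c3))


for x, (name, listed) in enumerate(zip(SCHEMES, TABLE)):
    print(f"{name:7s} X={x}  fit={fitted_factor(x):.2f}  table={listed:.2f}")
```

With the rounded coefficients given here, the fit agrees with the table to within 0.02 for every scheme; the largest difference is for JPEG75.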

\\

\\

!!Some comments:

\\

%%information
During the EIS science meeting at MSSL in May, the initial proposal for improving the EIS planning tool was to set different values for a few scenarios, such as QS, AR, and FLR for SCI_OBJ, and Slit (1"+2") and Slot (40"+266") for SLA.

However, this calculation shows that the compression factor differs between scenarios, but not by much, particularly for DPCM compression (values are close to 2.19). This suggests that, for simplicity, we may use a single value of the EIS compression factor for each scheme.
%%

\\

%%information
The numbers are quite different from the ones shown during the EIS science meeting in May. Basically, there are two reasons for this:

* We now have many more data points, covering all compression schemes (see the [datasets|../images/dcgifs/datasets_list.txt] used in this calculation).

* The start and end times in some FITS headers do not correspond to the times on the MDP data packet curve.
** For example, from the eis_l0_20080501_054833.fits.gz header:

 start: 2008-05-01T05:48:33\\
 end: 2008-05-01T05:50:03

[{Image src='images/dcgifs/eis_l0_20080501_054833.fits.gz.gif}]

''It is clear that if the start and end times in the FITS header are used, the calculated MDP data volume is 0; if the raster duration is used to derive the end time, only part of the MDP data packets are counted; using the study duration appears to give the complete data volume.''

\\

Another example: eis_l0_20080501_190013.fits.gz:

start: 2008-05-01T19:00:13\\
end: 2008-05-01T19:01:35

[{Image src='images/dcgifs/eis_l0_20080501_190013.fits.gz.gif}]

''For this FITS file, if the start and end times in the FITS header are used, only half of the MDP data packets are counted, which doubles the compression factor; if either the raster duration or the study duration is used to derive the end time, the complete data volume is obtained and the compression factor is reasonable.''
%%
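The effect of the end-time choice on the compression factor can be illustrated with a small Python sketch. It is not the code used for this investigation. It assumes that the factor is the uncompressed volume divided by the MDP volume counted between the start and end times, and that 1 kbit is 1000 bits. The packet times, the packet sizes, and the uncompressed volume are synthetic; only the 82 s header window (19:00:13 to 19:01:35) is taken from the second example above.

```python
from datetime import datetime, timedelta


def mdp_volume_kbits(packets, start, end):
    """Sum the sizes (kbits) of the packets whose time lies in [start, end]."""
    return sum(size for t, size in packets if start <= t <= end)


def compression_factor(def_volume_bits, mdp_kbits):
    """Uncompressed volume over compressed volume; None if nothing was counted."""
    if mdp_kbits <= 0:
        return None
    return def_volume_bits / (mdp_kbits * 1000.0)  # assumption: 1 kbit = 1000 bits


# Start time from the FITS header of the second example.
start = datetime(2008, 5, 1, 19, 0, 13)

# Synthetic data: 8 packets of 100 kbits, one every 20 s after the start.
packets = [(start + timedelta(seconds=20 * k), 100.0) for k in range(1, 9)]
def_volume_bits = 2_400_000  # synthetic uncompressed volume

header_end = datetime(2008, 5, 1, 19, 1, 35)     # end time from the FITS header
duration_end = start + timedelta(seconds=170)    # hypothetical raster/study duration

factor_header = compression_factor(
    def_volume_bits, mdp_volume_kbits(packets, start, header_end))
factor_duration = compression_factor(
    def_volume_bits, mdp_volume_kbits(packets, start, duration_end))

print("header window:  ", factor_header)
print("duration window:", factor_duration)
```

In this synthetic case the header window catches only half of the packets, so the factor comes out twice as large as with the longer window. A window that catches no packets yields no factor at all, which corresponds to the first example.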

\\

%%information
However, at this stage it is not clear which one is incorrect: the time in the FITS header or the time on the MDP packet curve.

This mismatch, i.e. the data arriving more slowly than expected, might explain the on-board MDP data recorder becoming full, for example when the EIS data packets for the previous and current studies arrive very close together in time.
%%

\\

%%information
A structure array stores all the related information for the EIS data investigated here. Each element of the array has the following format:

{{{
compFactor={compression_factor, $
          study_ACR     :'', $  ;string
          study_id      :'', $  ;string
          rast_ACR      :'', $  ;string
          rast_id       :'', $  ;string
          ll_ACR        :'',$   ;string
          ll_id         :'',$   ;string
          start_time    :'', $  ;string
          end_time      :'', $  ;string
          fitsname      :'',$   ;string
          target        :'',$   ;string
          sci_obj       :'',$   ;string
          slit          :'',$   ;string
          def_volume    :0LL,$  ;long64 int, unit: bits
          mdp_volume    :0.0,$  ;float, unit: kbits
          comp_scheme   :0,$    ;int
          nexp          :0,$    ;int
          rast_req      :0,$    ;int
          exposures     :fltarr(8) $    ;float, unit: sec
        }
}}}

I have attached an [IDL sav file |../images/dcgifs/str1_fitsheader_endtime_fixed.sav.tar.gz]. You may download it and experiment with it. For example, I use:

{{{if (str1[i].SCI_OBJ eq 'QS') && (str1[i].COMP_SCHEME eq 1) && (str1[i].MDP_VOLUME gt 0.) then ind[i]=1}}}

to extract the records with SCI_OBJ 'QS' that use the DPCM compression scheme.
%%
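For readers who prefer Python, the same selection can be sketched as follows. This is only an illustration. The records are made-up placeholders that mimic three fields of the structure above, and the value 1 for DPCM is taken from the IDL line; the other scheme code is hypothetical. Loading the sav file itself (for instance with scipy.io.readsav after unpacking the archive) is left out of the sketch.

```python
# Placeholder records mimicking a few fields of the compression_factor structure.
# These values are invented for illustration and are not from the sav file.
records = [
    {"sci_obj": "QS", "comp_scheme": 1, "mdp_volume": 512.0},
    {"sci_obj": "AR", "comp_scheme": 1, "mdp_volume": 640.0},
    {"sci_obj": "QS", "comp_scheme": 1, "mdp_volume": 0.0},    # no MDP data counted
    {"sci_obj": "QS", "comp_scheme": 3, "mdp_volume": 300.0},  # hypothetical scheme code
]


def select(records, sci_obj, comp_scheme):
    """Return the indices of records matching the target and scheme with MDP volume > 0."""
    return [i for i, r in enumerate(records)
            if r["sci_obj"] == sci_obj
            and r["comp_scheme"] == comp_scheme
            and r["mdp_volume"] > 0.0]


# comp_scheme 1 is DPCM, as in the IDL expression above.
qs_dpcm = select(records, "QS", 1)
print(qs_dpcm)
```

Returning indices rather than a 0/1 flag array keeps the result directly usable for picking out the matching records.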
\\

JianSun (MSSL) - 2008-06-09
----

\\

[{InsertPage page=EisCompressionTest}]

\\

JianSun (MSSL) - 2008-07-07
----