NewNumberForEISDC

The datasets for EIS Data Compression (DC) factor investigation are chosen as follows:

1. Search the EIS study database to find out which studies have adopted a specific compression scheme such as DPCM, JPEG95, etc. For example, studies 29, 30, 31, 32, 91, 92, 142, 255 and 258 have used the JPEG90 compression scheme.

notes:


2. Once the details (e.g. IDs) of the rasters that use a specific compression scheme are known, one can extract the original data volume for these rasters (from the EIS planning database) and the corresponding FITS files (from the EIS science database), and then assemble the dataset for the investigation. The file datasets_list.txt lists the datasets used in this case.

notes:




The way to calculate the EIS on-board compressed data volume from the MDP status curves was described in a previous post about the EIS DC factor: check it here.

The following links show how the DC factor varies across scenarios (e.g. AR and QS for SCI_OBJ; 1", 2", 40" and 266" for SLA).

A conservative estimate of the average factor is indicated by the orange dashed line on each plot; it could be used to improve the EIS planning tools.



note:

The EIS Data Compression (DC) factors listed here cover only Science Target and Slit/Slot selection. For the variation of the factor with exposure time, please refer to H. Hara's results (PDF).




For ease of analysis, the plots in the links above can be summarised in a data table:


Scheme   Total   QS            AR            CH           1"            2"            40"           266"
DPCM     2.19    2.19 (2.38)   2.19          2.19         2.54          2.46          2.19          2.18
JPEG98   2.33    2.46          2.22          3.15         2.85 (avg)    2.31          2.28 (2.32)   -
JPEG95   3.26    3.64          3.26          -            6.12          6.38 (6.54)   3.26          -
JPEG92   3.63    3.31 (avg)    6.65 (avg)    2.74 (avg)   6.53 (avg)    2.74 (avg)    3.63 (avg)    -
JPEG90   3.82    3.82          4.34          3.73         4.67 (avg)    -             3.84          3.78
JPEG85   4.51    4.82          4.51          4.91         3.9 (avg)     -             4.51          -
JPEG75   6.20    -             6.2           -            18.88 (avg)   -             6.2           7.52
JPEG65   7.33    -             19.94 (avg)   -            25.97 (avg)   -             7.33 (avg)    -
JPEG50   17.72   20.98         17.72         -            35.37 (avg)   -             17.72         -


Using the DC factors (i.e. 'Total' in the table above) for trend fitting, and comparing them with the numbers previously used in the EIS planning tool, we have:

total EIS DC factor:

DPCM   JPEG98   JPEG95   JPEG92   JPEG90   JPEG85   JPEG75   JPEG65   JPEG50
2.19   2.33     3.26     3.63     3.82     4.51     6.20     7.33     9.36

previous EIS DC factor:

DPCM   JPEG98   JPEG95   JPEG92   JPEG90   JPEG85   JPEG75   JPEG65   JPEG50
2.36   2.70     3.47     4.22     4.63     5.74     7.63     9.43     12.00

The orange dashed line in the figure can be described by the equation: Y = 2.11768 + 0.556287*X - 0.0855122*X^2 + 0.0161953*X^3

Thus, the corresponding numbers on the line are:

DPCM   JPEG98   JPEG95   JPEG92   JPEG90   JPEG85   JPEG75   JPEG65   JPEG50
2.12   2.60     3.02     3.45     4.01     4.79     5.89     7.38     9.39
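
As a quick check, the fitted line can be evaluated next to the measured and previous factors. The sketch below is only an illustration: it assumes that X is simply the position of the scheme in the list (0 for DPCM up to 8 for JPEG50), which is not stated above but agrees with the listed values to within 0.02. The variable names are made up for this example.

```idl
; Scheme names and the two sets of factors quoted in the text
schemes  = ['DPCM','JPEG98','JPEG95','JPEG92','JPEG90','JPEG85','JPEG75','JPEG65','JPEG50']
total_dc = [2.19, 2.33, 3.26, 3.63, 3.82, 4.51, 6.20, 7.33, 9.36]
prev_dc  = [2.36, 2.70, 3.47, 4.22, 4.63, 5.74, 7.63, 9.43, 12.00]

; Coefficients of the cubic quoted in the text, lowest order first
coeffs = [2.11768d, 0.556287d, -0.0855122d, 0.0161953d]

; ASSUMPTION: X = 0 for DPCM, 1 for JPEG98, ..., 8 for JPEG50
x = dindgen(n_elements(schemes))

; POLY evaluates c[0] + c[1]*x + c[2]*x^2 + c[3]*x^3
fit_dc = poly(x, coeffs)

print, 'Scheme', 'Total', 'Prev', 'Fit', format='(A8, 3A8)'
for i = 0, n_elements(schemes)-1 do $
  print, schemes[i], total_dc[i], prev_dc[i], fit_dc[i], format='(A8, 3F8.2)'
end
```

Saved as a main-level program and run with .run, this prints one row per compression scheme, so the new, previous and fitted factors can be compared side by side.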



Some comments:


At the EIS science meeting at MSSL in May, the initial proposal for improving the EIS planning tool was to set different values for a few scenarios, such as QS, AR and FLR for SCI_OBJ, and Slit (1"+2") and Slot (40"+266") for SLA.

However, this calculation suggests that the compression factor differs between scenarios, but not by much; for DPCM compression in particular, the values are all close to 2.19. This suggests that, for simplicity, we could use a single set of EIS compression factors, with one value per compression scheme.


The numbers are quite different from the ones shown at the EIS science meeting in May. Basically, there are two reasons for this:
  • we now have many more data points, covering all compression schemes (see the datasets used in this calculation).
  • the start and end times in some FITS headers do not correspond to the times on the MDP data packet curve
    • For example, from the header of eis_l0_20080501_054833.fits.gz:

start: 2008-05-01T05:48:33
end: 2008-05-01T05:50:03

Clearly, if the start and end times from the FITS header are used, the calculated MDP data volume is 0; if the raster duration is used to derive the end time, only part of the MDP data packets is counted; using the study duration appears to give the complete data volume.


Another example: eis_l0_20080501_190013.fits.gz:

start: 2008-05-01T19:00:13
end: 2008-05-01T19:01:35

For this FITS file, using the start and end times from the header counts only half of the MDP data packets, which doubles the compression factor; using either the raster duration or the study duration to derive the end time gives the complete data volume and a reasonable compression factor.
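
The effect of the time window can be illustrated with made-up numbers. In the sketch below the packet times, packet sizes, window lengths and raw volume are all hypothetical and chosen only to mimic the second example. It is not the actual MDP processing code; it simply sums the packets that fall inside a chosen window.

```idl
; HYPOTHETICAL packet record: arrival times (seconds after the FITS start
; time) and packet sizes (kbits). Values are invented for illustration.
pkt_time = [10., 40., 70., 100., 130., 160.]
pkt_kbit = [50., 50., 50., 50., 50., 50.]

t_start      = 0.
t_end_header = 82.    ; end time as read from the FITS header (hypothetical)
t_end_study  = 180.   ; start time + study duration (hypothetical)
raw_kbit     = 900.   ; uncompressed volume of the raster (hypothetical)

; Select the packets inside each window; WHERE returns -1 with a count of 0
; when nothing matches, so the count is checked before indexing.
in_hdr = where((pkt_time ge t_start) and (pkt_time le t_end_header), n_hdr)
in_std = where((pkt_time ge t_start) and (pkt_time le t_end_study),  n_std)
vol_hdr = (n_hdr gt 0) ? total(pkt_kbit[in_hdr]) : 0.
vol_std = (n_std gt 0) ? total(pkt_kbit[in_std]) : 0.

print, 'header window volume (kbits): ', vol_hdr
print, 'study window volume (kbits):  ', vol_std
if vol_hdr gt 0 then print, 'factor, header window: ', raw_kbit / vol_hdr
if vol_std gt 0 then print, 'factor, study window:  ', raw_kbit / vol_std
end
```

With these invented values the header window catches three of the six packets, so the factor derived from it is twice the one derived from the study window, which is the behaviour described above.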


However, at this stage it is not clear which one is incorrect: the time in the FITS header or the time on the MDP packet curve.

This mismatch, i.e. the data arriving later than expected, might explain why the on-board MDP data recorder fills up, for example when the EIS data packets of the previous and current studies arrive very close together in time.


A structure array stores all the related information for the EIS data investigated here. Each element of the array has the following format:
compFactor={compression_factor, $
          study_ACR     :'', $  ;string
          study_id      :'', $  ;string
          rast_ACR      :'', $  ;string
          rast_id       :'', $  ;string
          ll_ACR        :'',$   ;string
          ll_id         :'',$   ;string
          start_time    :'', $  ;string
          end_time      :'', $  ;string
          fitsname      :'',$   ;string
          target        :'',$   ;string
          sci_obj       :'',$   ;string
          slit          :'',$   ;string
          def_volume    :0LL,$  ;long64 int, unit: bits
          mdp_volume    :0.0,$  ;float, unit: kbits
          comp_scheme   :0,$    ;int
          nexp          :0,$    ;int
          rast_req      :0,$    ;int
          exposures     :fltarr(8) $    ;float, unit: sec
        }

An IDL sav file is attached. You can download it and experiment with it; for example, I use:

if (str1[i].SCI_OBJ eq 'QS') && (str1[i].COMP_SCHEME eq 1) && (str1[i].MDP_VOLUME gt 0.) then ind[i]=1

to extract the records with SCI_OBJ 'QS' that use the DPCM compression scheme.
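
The same selection can also be written without a loop, using WHERE, and then turned into a compression factor. This is only a sketch under several assumptions that are not confirmed above: 'compFactor.sav' is a placeholder name for the attached file, str1 is the structure array it restores, def_volume is the uncompressed volume, 1 kbit is taken as 1000 bits, and the factor is taken as def_volume divided by mdp_volume.

```idl
; ASSUMPTION: placeholder file name; replace with the downloaded sav file
restore, 'compFactor.sav'

; Records with SCI_OBJ 'QS', DPCM compression (comp_scheme 1, as in the
; example above) and a non-zero MDP volume
sel = where((str1.sci_obj eq 'QS') and (str1.comp_scheme eq 1) and $
            (str1.mdp_volume gt 0.), nsel)

if nsel gt 0 then begin
  ; ASSUMPTION: def_volume is in bits, mdp_volume in kbits, 1 kbit = 1000 bits
  factor = double(str1[sel].def_volume) / (str1[sel].mdp_volume * 1000d)
  print, 'records selected:       ', nsel
  print, 'mean per-record factor: ', total(factor) / nsel
endif else print, 'no matching records'
end
```

Averaging the per-record ratios is one possible choice; dividing the summed def_volume by the summed mdp_volume would weight large rasters more heavily and may give a different number.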


JianSun (MSSL) - 2008-06-09