Monte Carlo contact's corner

On this page we collect useful information for Monte Carlo contacts on how to operate in McM.


Preliminary steps

Let us know your name! When we have the names of all MC subgroup contacts we will create a mailing list.

Register to McM following these instructions (as a generator contact, not as a normal user): https://twiki.cern.ch/twiki/bin/viewauth/CMS/McM#Register

Ask the generator conveners (cms-phys-conveners-GEN@cern.ch) for permission to upload LHE files to the EOS disks.

Be subscribed to the following HNs:

  • to actively use: prep-ops (https://hypernews.cern.ch/HyperNews/CMS/get/prep-ops.html)

  • to ask questions: generators (https://hypernews.cern.ch/HyperNews/CMS/get/generators.html)

  • just to monitor generation status: datasets (https://hypernews.cern.ch/HyperNews/CMS/get/datasets.html), dataopsrequests (https://hypernews.cern.ch/HyperNews/CMS/get/dataopsrequests.html), comp-ops (https://hypernews.cern.ch/HyperNews/CMS/get/comp-ops.html)

A general flow-chart of MC request submission (courtesy of D. Sheffield)

Flow chart of Monte Carlo request submission

1 - Preparing LHE files

Many generators (POWHEG, MadGraph5_aMCatNLO etc.) use LHE files as a starting point for generation. If you use a generator that does the ME step on the fly (e.g. bare Pythia) just skip this part.

  • If you need help preparing Madgraph5_aMCatNLO LHE files, you can contact the CMS MadGraph team using the generators HyperNews. See also the dedicated section of this Twiki for Matrix Element generators.

There are two ways to technically produce LHE files:

  • In most cases, you should use the campaigns called "xxxwmLHE" to request a central production of the LHE files. This also helps to keep track of the sample.

  • Only if this is not possible, or if analysts provide LHE files directly, can you produce them "privately"; in that case use the campaigns called "xxxpLHE".

P.S.: It is strongly recommended to check the LHE files locally before uploading, so that broken LHE files, which cause a lot of problems later in production, are not uploaded. For example:

xmllint --stream --noout ${file}.lhe > /dev/null 2>&1; test $? -eq 0 || fail_exit "xmllint integrity check failed on ${file}.lhe"
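A minimal sketch to run the same integrity check over all files before upload (it assumes, purely as an example, that the files sit in a local tempdir/ directory):

import glob
import os
import subprocess
import sys

# Run the xmllint integrity check above over every LHE file in tempdir/ (placeholder path).
bad_files = []
devnull = open(os.devnull, 'w')
for lhe in sorted(glob.glob('tempdir/*.lhe')):
    ret = subprocess.call(['xmllint', '--stream', '--noout', lhe],
                          stdout=devnull, stderr=subprocess.STDOUT)
    if ret != 0:
        bad_files.append(lhe)
        print('xmllint integrity check FAILED for %s' % lhe)
if bad_files:
    sys.exit(1)  # do not upload anything if even one file is broken
print('all LHE files passed the xmllint check')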

1.1 - Using wmLHE campaigns

Follow these instructions only if you want central (automatic) production of the LHE files:

  • Supported generators for this option are Madgraph5_aMCatNLO (replaces old MadGraph) and POWHEG v2 (if a process is not yet available in v2, we temporarily accept v1)

  • In order to use this option you first need to produce a "gridpack" (Madgraph5_aMCatNLO) or an executable/grid-file tarball (POWHEG). Instructions to produce those can be found in the dedicated Madgraph5_aMCatNLO and POWHEG instruction pages.

  • Before running the gridpack/tarball production, the run_card.dat and proc_card.dat cards (for Madgraph5_aMCatNLO) or the powheg.input card (for POWHEG) should be uploaded, via pull request, to the corresponding production card repository for MadGraph5_aMCatNLO or POWHEG, following the dedicated instructions.

  • Generator conveners must approve the pull request; at that point the cards can be used for the production of the gridpack/tarball.

  • Exception: if you use cards provided in the example card repository for MadGraph5_aMCatNLO or for POWHEG, without any changes, there is no need to upload them to the production area.

  • When you have the gridpack/tarball, it should be copied to the following area of EOS (with cmsStage), after creating a proper subdirectory (with cmsMkdir):

/store/group/phys_generator/cvmfs/gridpacks/<your scram architecture>/13TeV/<your generator>/<your version>/

e.g.

/store/group/phys_generator/cvmfs/gridpacks/slc6_amd64_gcc481/13TeV/madgraph/V5_2.2.2/
/store/group/phys_generator/cvmfs/gridpacks/slc6_amd64_gcc491/13TeV/powheg/V2/

After a few hours these files will be automatically copied to the cvmfs repository under:

/cvmfs/cms.cern.ch/phys_generator/gridpacks/slc...  etc. etc.

At this point they are usable by the externalLHEProducer (see below). It is of course also possible to use already existing gridpacks.

1.2 - Using pLHE campaigns

Follow these instructions only if you have produced the LHE files privately:

  • Store them in a temporary directory, e.g. tempdir/.

  • In any CMS release, produce a configuration with the following command, which translates the LHE files into an EDM ROOT file:

cmsDriver.py MCDBtoEDM --conditions <conditions for that release> -s NONE \
    --eventcontent RAWSIM --datatier GEN \
    --filein file:/tmp/covarell/h_PG_TT_TTVBF_90_0.lhe --no_exec -n 1
  • Change the LHESource PSet to have your files as input (it would be good to run on all of them, if possible, to catch individual corrupted files; see the sketch below) and run normally.

  • When running the configuration with cmsRun, if it reaches the end with no exceptions or errors, your LHE files are validated.
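A minimal sketch of the LHESource edit in the configuration produced by cmsDriver.py above; the file names are placeholders for your own files in tempdir/:

# Edit the cmsDriver-generated configuration: point the LHESource at your own files
# (the paths below are placeholders) and process all events in all of them.
process.source = cms.Source("LHESource",
    fileNames = cms.untracked.vstring(
        'file:tempdir/h_to_tautau_120_1.lhe',
        'file:tempdir/h_to_tautau_120_2.lhe',
        'file:tempdir/h_to_tautau_120_3.lhe',
    )
)
process.maxEvents.input = cms.untracked.int32(-1)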

  • Your LHE files should go in a new directory, named as the first available number on /store/lhe. To do this, use the script GeneratorInterface/LHEInterface/scripts/cmsLHEtoEOSManager.py. The command format is:

cmsLHEtoEOSManager.py -n --compress -f <list of LHE files separated by commas (no blanks!)>

e.g.

cmsLHEtoEOSManager.py -n --compress -f tempdir/h_to_tautau_120_1.lhe,tempdir/h_to_tautau_120_2.lhe,tempdir/h_to_tautau_120_3.lhe

If the --compress option does not work, remove it. In particular, do not use it if the samples are going to be processed in releases older than CMSSW 5.3.8.
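If you have many files, a small helper like the following sketch (again assuming they sit in tempdir/) builds the comma-separated list with no blanks for the -f option:

# Build the comma-separated (no blanks!) file list expected by the -f option.
import glob
file_list = ','.join(sorted(glob.glob('tempdir/*.lhe')))
print('cmsLHEtoEOSManager.py -n --compress -f ' + file_list)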

If you have many LHE files with a small number of events each (the typical output of batch-queue processing), it may be a good idea to first merge them into a single file or a few files. To do this, download the script mergeLheFiles.cpp and modify the following line, which defines the output merged file:

std::ofstream outFile("/tmp/covarell/out.lhe", std::ios::out);

and perform the following actions:

g++ -Wall -o mergeLheFiles mergeLheFiles.cpp
ls tempdir/*.lhe > laMiaLista
./mergeLheFiles laMiaLista

and you will get a single LHE file called out.lhe.

2 - Submitting a request in McM

Complete instructions on how to use this tool are available in the McM detailed guide. These simple actions are needed to submit a sample:

  • Go to McM.

  • Make sure you understand which campaign you want to use. Every base name (e.g. Summer12, RunIIWinter15, etc.) corresponds to a well-defined release cycle and set of DB conditions.

  • The procedure to follow differs depending on whether you have prepared gridpacks/LHE files in step 1 or the request does not need an LHE step.

2.1 - If starting from LHE files/wmLHE workflows

  • You must start from a campaign called xxxpLHE or xxxwmLHE. To create a new request in those campaigns you can follow the creating instructions, but in practice it is usually simpler to clone an existing sample in the campaign (use the cloning instructions).

  • If the original request is very old, use the button "option reset".

  • Edit the cloned request

    • You can check the "Valid" box and set a number of events n, in the range 0 < n <= nTotal. Do it at your own risk, as the validation may crash badly.

    • If this is a "xxxpLHE" request, set "MCDB Id" (= the LHE number on EOS, see above), otherwise set it to 0.

    • Put the number of events actually available in the above upload, or fewer. Never put more than this number; e.g. if you have uploaded 199800 events in LHE files, do not round up to 200000.

    • Put the cross-section if people in analysis are supposed to use it! If there are better estimates of the total cross-section than the one from the present generator (e.g. at higher QCD orders, or from the LHC XS WG), then just put 1.0. Leaving -1 is not OK.

    • For filter and matching efficiencies always put 1 with error 0. Leaving -1 is not OK.

    • Set generators (use the json format that you see in the original request)

    • If this is a "xxxwmLHE" request, put the generator fragment for external LHE production, otherwise leave it blank. You can copy it by hand into the big "Fragment" window. An example of a xxxwmLHE fragment is shown below: just change the tarball name, the commented link to the cards (with the specific GitHub revision) and, if needed, the nEvents per job, and leave the rest of the configuration unchanged. Note that for gridpacks the git revision information is also written in the gridpack_generation.log inside the tarball, in case of doubt.

import FWCore.ParameterSet.Config as cms

# link to cards:
# https://github.com/cms-sw/genproductions/tree/e30fc9c7d9226a2c96869c0ddbe5e65884afd013/bin/MadGraph5_aMCatNLO/cards/production/13TeV/dyellell012j_5f_NLO_FXFX

externalLHEProducer = cms.EDProducer("ExternalLHEProducer",
    args = cms.vstring('/cvmfs/cms.cern.ch/phys_generator/gridpacks/slc6_amd64_gcc481/13TeV/madgraph/V5_2.2.2/dyellell012j_5f_NLO_FXFX/v1/dyellell012j_5f_NLO_FXFX_tarball.tar.xz'),
    nEvents = cms.untracked.uint32(5000),
    numberOfParameters = cms.uint32(1),
    outputFile = cms.string('cmsgrid_final.lhe'),
    scriptName = cms.FileInPath('GeneratorInterface/LHEInterface/data/run_generic_tarball_cvmfs.sh')
)
    • Put the dataset name following these rules.

    • For time and size per event just put dummy (= very small) values, e.g. 0.001 s and 30 kB.

  • Report at the MC coordination meeting (Thursday 3 pm) that the request is ready (if the request is very urgent, tell hn-cms-prep-ops@cern.ch directly). The request is accepted (or you will be asked to revise it) and prioritized. At this point the production managers will "chain" the request, i.e. the system will automatically create a GENSIM request in the campaign with the same base name and the xxxGS suffix (e.g. RunIIWinter15pLHE or RunIIWinter15wmLHE will create an entry in RunIIWinter15GS). This request will not proceed automatically, so it is up to you to edit the new request:

  • First of all, put the generator fragment for showering/hadronizing/decaying your MC sample. You can copy it by hand into the big window, or you give the fragment name and fill in "Fragment tag" with the corresponding package tag in GitHub. If you need to upload a new one to GitHub, follow the dedicated instructions; a list of already available fragments is also available. Possible associations:

    • a fragment with MLM jet matching (MadGraph5_aMCatNLO at LO)

    • a fragment with FxFx jet matching (MadGraph5_aMCatNLO at NLO)

    • a fragment with POWHEG emission veto (POWHEG)

    • a generic hadronizer for LO generators with no jet matching

    • Settings in these fragments should be adapted to your case (flavor scheme, extra weak particles, etc.). In particular, if you are using jet matching and the gridpack is new (created by yourself), the qCut must be measured (see step 4 on how to measure it).

  • You should check the "Valid" box and set a number of events n of about 30-50.

  • Put the cross-section (after hadronization; see step 4 on how to measure it) if people in analysis are supposed to use it! If there are better estimates of the total cross-section than the one from the present generator (e.g. at higher QCD orders, or from the LHC XS WG), then just put 1.0. Leaving -1 is not OK.

  • Insert sensible filter and matching efficiencies (see step 4 on how to measure them). Leaving -1 is not OK.

  • Insert sensible timing and size (see step 4 on how to measure them).

  • Move the request to the next step (chain validation). In the action box close to the request name, click on "chained request", then in the new page that appears click on "validate chain" in the action box: when this is completed you will receive a mail from McM.

  • At that point, if you had activated the validation, you should go back to the request, click on the "Select View" tab above and tick "Validation": a new column will appear with a link to a DQM page (if available).

    • If the validation fails, you will be notified by email with the logfile attached. Go back and edit the request to fix it.

  • In the DQM page, navigate to the proper folder, check in the plots that everything is as you expect, and define the request (click "next step" again).

2.2 - If not starting from LHE files

  • You must start from a campaign called xxxGS, e.g. RunIIWinter15GS. To create a new request in those campaigns you can follow the creating instructions, but in practice it is usually simpler to clone an existing sample in the campaign (use the cloning instructions).

  • If the original request is very old, use the button "option reset".

  • Edit the cloned request

    • You should check the "Valid" box and set a number of events n, in the range 30-50.

    • Add the cross-section (see step 4 on how to measure it). Leaving -1 is not OK.

    • Add filter and matching efficiencies (see step 4 on how to measure them), if unity, put 1 with error 0. Leaving -1 is not OK.

    • Set generators (use the json format that you see in the original request)

    • Set time and size per event (see step 4 on how to measure them).

    • Put the generator fragment for generating your MC sample. You can copy it by hand into the big window, or you give the fragment name and fill in "Fragment tag" with the corresponding package tag in GitHub. If you need to upload a new one to GitHub, follow the dedicated instructions.

    • Put the dataset name following these rules.

  • Move the request to the next step (validation): it is the small ">" symbol beside the request name, in the "view" panel. When this is completed you will receive a mail from McM.

  • Afterwards, report at the MC coordination meeting (Thursday 3 pm) that the request is ready (if the request is very urgent, tell hn-cms-prep-ops@cern.ch directly).

  • At that point, you should go back to the request, click on the "Select View" tab above and tick "Validation": a new column will appear with a link to a DQM page (if available).

    • If the validation fails, you will be notified by email with the logfile attached. So go back and edit the request to fix it.

  • In the DQM page, navigate to the proper folder, check in the plots that everything is as you expect, and define the request (click "next step" again).

3 - Checking status

To check the status of a submitted request:

  • Go to the request, click on the "Select View" tab above and tick "Reqmgr name": a new column will appear with some links ("details", "stats", etc.). Click on the small eye next to the links to see a graph of the sample status.

4 - Additional things

There are some quantities you have to measure so that the production team knows the properties of the request and can go ahead with the central production. The filter efficiency can be measured by running just the GEN step, while the timing and size are the time needed to run one GENSIM event and its size on disk.

To start this procedure, click the "Get setup" button of the request and check that the output is correct (N.B. for this you need to have uploaded the generator decay fragment beforehand!).

Measure filter efficiencies and cross-section

You need this step if:

  • you want to measure a cross-section after hadronization (for backgrounds, for signal there are precise estimates from the LHC cross-section WG already)

  • you have filters on final state particles, so filter efficiency is smaller than 1

  • you have jet matching cuts, so matching efficiency is smaller than 1

If you don't need this step, all the numbers above are 1; just proceed to "Measure timing and size".

Use the script CmsDrivEasier.sh in this way:

source CmsDrivEasier.sh <request or chained request McM ID> <n> 1

e.g.

source CmsDrivEasier.sh HIG-Fall13-00001 10000 1     or
source CmsDrivEasier.sh SMP-chain_Summer12WMLHE_flowLHE2S12_flowS12to53-00008 10000 1 

where n is a number of events sufficient to measure the efficiency precisely; e.g. if the efficiency is expected to be of order 1/100 you could use 100000 events.
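As a rough guide, the number of events needed for a given relative precision on the efficiency follows from binomial statistics; a small sketch with placeholder inputs:

# Rough estimate of the GEN statistics needed to measure a filter efficiency
# to a target relative precision (binomial statistics, placeholder inputs).
expected_eff = 0.01      # expected filter efficiency
target_rel_error = 0.05  # desired relative uncertainty (5%)

n_needed = (1.0 - expected_eff) / (expected_eff * target_rel_error ** 2)
print('run about %.0f events' % n_needed)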

In the output, look for the string "GenXsecAnalyzer": below it you find a detailed log of the needed quantities.
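To cross-check the numbers by hand, the filter (or matching) efficiency is the ratio of accepted to processed events, with a binomial uncertainty; a minimal sketch with placeholder counts read off the GenXsecAnalyzer summary:

# Cross-check of a filter/matching efficiency and its binomial error.
# The counts are placeholders: take them from the GenXsecAnalyzer summary.
import math

n_total = 100000   # events processed
n_passed = 1037    # events accepted

eff = float(n_passed) / n_total
err = math.sqrt(eff * (1.0 - eff) / n_total)
print('filter efficiency = %.5f +/- %.5f' % (eff, err))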

Measure proper value of qCut

You need this step if you have jet matching cuts.

Do all the steps of the paragraph above, using a lot of events and a reasonable starting qCut: this should be about 1.5-2 times larger than the ptj or xqcut parameter used in the ME generation. At the end you should find a ROOT file in /tmp/<your username>/<request ID>.root. Then download the plotdjr.C macro and run it in ROOT as:

root [0] .x plotdjr.C("/tmp/<your username>/<request ID>.root","test.root")

test.root will contain a canvas showing the differential jet rate distributions, separated according to the number of original partons in the event. Of course the histograms do not match perfectly, because additional jets are created by the parton shower: for the plot corresponding to the i-th jet rate, you see a sharp drop at pT = qCut for the sample with i-1 partons and a sharp rise for the sample with i partons.

  • You should check that the TOTAL jet rate distributions appear smooth at pT ~ qCut.

  • If there is a kink somewhere, try with a larger qCut; if there is a dip, try with a smaller one.

  • Repeat until you find smooth distributions.
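For reference, in a typical Pythia8 hadronizer fragment with MLM matching the measured matching scale goes into the JetMatching:qCut parameter of the processParameters block. The excerpt below is only an illustrative sketch: the tune block, the other JetMatching settings and all numerical values are placeholders to be adapted to your gridpack and campaign.

import FWCore.ParameterSet.Config as cms
from Configuration.Generator.Pythia8CommonSettings_cfi import *

# Illustrative Pythia8 hadronizer excerpt with MLM jet matching (values are placeholders).
generator = cms.EDFilter("Pythia8HadronizerFilter",
    maxEventsToPrint = cms.untracked.int32(1),
    pythiaPylistVerbosity = cms.untracked.int32(1),
    filterEfficiency = cms.untracked.double(1.0),
    comEnergy = cms.double(13000.),
    PythiaParameters = cms.PSet(
        pythia8CommonSettingsBlock,
        processParameters = cms.vstring(
            'JetMatching:setMad = off',
            'JetMatching:scheme = 1',
            'JetMatching:merge = on',
            'JetMatching:jetAlgorithm = 2',
            'JetMatching:etaJetMax = 5.',
            'JetMatching:coneRadius = 1.',
            'JetMatching:slowJetPower = 1',
            'JetMatching:qCut = 19.',   # <-- put here the qCut found with plotdjr.C
            'JetMatching:nJetMax = 4',  # number of partons in the highest-multiplicity ME sample
        ),
        parameterSets = cms.vstring('pythia8CommonSettings',
                                    'processParameters')
    )
)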

Measure timing and size

Use the script CmsDrivEasier.sh in this way:

source CmsDrivEasier.sh <request or chained request McM ID> <n> 2

e.g.

source CmsDrivEasier.sh HIG-Fall13-00001 30 2     or
source CmsDrivEasier.sh SMP-chain_Summer12WMLHE_flowLHE2S12_flowS12to53-00008 30 2

where 30 (or 30/filter efficiency, if the filter efficiency is not 1) is usually a sufficient number of events to measure the time and size.

  • At the end of the output, you get the needed statistics:

    • AvgEventTime is the value to fill in for time/event (in seconds)

    • Timing-tstoragefile-write-totalMegabytes divided by TotalEvents is the size/event (in MB; in McM you have to convert it to kB)
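A minimal sketch of the conversion into the values McM expects, with placeholder numbers taken from the job report:

# Convert the job-report numbers into the time/event and size/event fields for McM.
avg_event_time = 42.7    # AvgEventTime, in seconds  -> time/event
total_megabytes = 0.9    # Timing-tstoragefile-write-totalMegabytes
total_events = 30        # TotalEvents

size_per_event_kb = total_megabytes * 1024.0 / total_events
print('time/event = %.2f s' % avg_event_time)
print('size/event = %.1f kB' % size_per_event_kb)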

Validation failure because of non existing wmLHE dataset

If a GENSIM request is chained to a wmLHE request that was produced far in the past, the validation may fail: the validation runs at CERN, but the files of the LHE dataset could have been moved elsewhere in the meantime. In that case you need to request a transfer:

  • Click on the "input dataset" you need in the GS request page (or on the output dataset in the wmLHE request page); it will direct you to the PhEDEx page.

  • Click on "Subscribe to" in the PhEDEx page.

  • In the first box, you will see that dataset. You can put all the datasets you need in the same box.

  • Choose T2_CH_CERN.

  • Choose the AnalysisOps group.

  • Add a reason, e.g. "For validation".

  • Click Accept.

  • Please ping the prep-ops HN, so that we can approve quickly.
