Particle physics @ Utopian-io - Designing an LHC search strategy for unraveling new phenomena


The first chapter (spread over six posts) of my detailed notes on how to contribute to state-of-the-art research in particle physics with Utopian.io and Steem is about to be finalized, this post being the last building block of my introduction to the MadAnalysis 5 framework.

Note that everything beyond the first paragraph may be obscure to anyone who did not read the first 5 posts ;)


[image credits: Geni (CC BY-SA 4.0)]

This project has been running on Steem for a couple of months and aims to offer developers a chance to help particle physicists study theories extending the Standard Model of particle physics.

All those potential new theories are well motivated, for various important reasons. However, determining how current data from the Large Hadron Collider (the LHC) at CERN constrains this or that model is far from trivial.

This often requires the simulation of LHC collisions in which signals from given theoretical contexts are generated. One then reanalyzes the LHC results to evaluate the compatibility of the data with those signals. As no hint of any new phenomenon has been observed so far, it is clear that one must ensure that any signal stays stealthy.

The above-mentioned tasks require mimicking LHC analysis strategies, and this is where external developers can enter the game. In this project, I have introduced the MadAnalysis 5 framework, in which C++ implementations of given LHC search strategies have to be written.

Providing these C++ codes is where anyone can help!

Of course, there is still a need to detail how to validate those contributions, but this topic is left for after the summer break.


THE MADANALYSIS 5 DATA FORMAT: 5 CLASSES OF OBJECTS

Let me first quickly recapitulate the most important pieces of information provided in the earlier posts, after recalling that installation information can be found in this first post.

In the next two posts (here and there), I introduced four of the five different classes of objects that can be reconstructed from the information stored in a detector:

  • Electrons, which can be accessed through event.rec()->electrons() in MadAnalysis 5. This consists of a vector of RecLeptonFormat objects.
  • Muons, which can be accessed through event.rec()->muons() in MadAnalysis 5. This consists of a vector of RecLeptonFormat objects.
  • Photons, which can be accessed through event.rec()->photons() in MadAnalysis 5. This consists of a vector of RecPhotonFormat objects.
  • Jets, which can be accessed through event.rec()->jets() in MadAnalysis 5. This consists of a vector of RecJetFormat objects.


[image credits: rawpixel (CC0)]

Surprise, surprise: taus are there as well, i.e., the big brothers of the electrons and the muons. It is never too late to introduce them!

They can be accessed like any other guy, through event.rec()->taus(), which returns a vector of RecTauFormat objects.

All these different objects have a bunch of properties that can be used when designing an analysis. Those properties are often connected to the tracks left, transversely to the collision axis, by the object in a detector.

One has for instance the object transverse momentum (object.pt()), transverse energy (object.et()) and pseudorapidity (object.eta() or object.abseta() for its absolute value) that somehow describe the motion of the object in the detector.
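To recall what these accessors measure, here is a minimal sketch, independent of the framework, of how the transverse momentum and the pseudorapidity follow from the momentum components. The Momentum struct and the free functions below are hypothetical stand-ins introduced for illustration only; in an actual analysis, one simply calls object.pt() and object.eta().

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the three-momentum of a reconstructed object
// (in GeV). The z axis is taken along the collision axis.
struct Momentum {
  double px, py, pz;
};

// Transverse momentum: size of the momentum component perpendicular
// to the collision axis.
double pt(const Momentum& p) { return std::hypot(p.px, p.py); }

// Pseudorapidity: eta = asinh(pz / pT). It vanishes for an object emitted
// perpendicularly to the beam and grows as the object gets closer to it.
double eta(const Momentum& p) {
  assert(pt(p) > 0.0);  // eta is not defined along the collision axis
  return std::asinh(p.pz / pt(p));
}

// Absolute value of the pseudorapidity.
double abseta(const Momentum& p) { return std::fabs(eta(p)); }
```

For instance, an object with (px, py, pz) = (3, 4, 0) GeV has a transverse momentum of 5 GeV and a vanishing pseudorapidity.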

Finally, analyses are often making use of the energy carried away by some invisible particles. This is in particular crucial for what concerns dark matter, as dark matter is invisible. This ‘missing energy’ can be accessed through

  MALorentzVector pTmiss = event.rec()->MET().momentum();
  double MET = pTmiss.Pt();

which respectively return a four-vector and a double-precision number.
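The idea behind the missing energy can be illustrated with a small sketch that does not rely on the framework: since momentum is conserved in the plane transverse to the collision axis, whatever escapes detection is deduced as minus the vector sum of everything that is seen. The Vec2 struct and the two functions are hypothetical names used for illustration; MadAnalysis 5 provides the result directly through event.rec()->MET().

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for the transverse momentum of a visible object (GeV).
struct Vec2 {
  double px, py;
};

// Missing transverse momentum: minus the vector sum of all visible
// transverse momenta (momentum conservation in the transverse plane).
Vec2 missingPt(const std::vector<Vec2>& visible) {
  Vec2 miss{0.0, 0.0};
  for (const Vec2& v : visible) {
    miss.px -= v.px;
    miss.py -= v.py;
  }
  return miss;
}

// Magnitude of the missing transverse momentum.
double met(const Vec2& miss) { return std::hypot(miss.px, miss.py); }
```

With two visible objects of transverse momenta (100, 0) and (-40, 80) GeV, the missing transverse momentum is (-60, -80) GeV, i.e. 100 GeV in magnitude.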


THE MADANALYSIS 5 DATA FORMAT: OBJECT ISOLATION


[image credits: Todd Barnard (CC BY-SA 2.0)]

Object separation is essential to ensure a clean reconstruction, in which two specific objects do not leave overlapping signatures in the detector.

The command object1.dr(object2) allows us to evaluate the angular distance between two objects, which is then often imposed to be rather large for quality reasons.
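For reference, the angular distance is usually defined as ΔR = sqrt(Δη² + Δφ²). The sketch below shows this definition in a framework-independent way; the Direction struct and the function names are hypothetical, and the only subtle point is the wrapping of the azimuthal difference.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the direction of a reconstructed object.
struct Direction {
  double eta;  // pseudorapidity
  double phi;  // azimuthal angle, in radians
};

// Azimuthal difference wrapped into [-pi, pi], so that two objects lying on
// either side of the phi = +-pi boundary are correctly seen as close.
double deltaPhi(double phi1, double phi2) {
  const double twoPi = 2.0 * std::acos(-1.0);
  assert(std::isfinite(phi1) && std::isfinite(phi2));
  return std::remainder(phi1 - phi2, twoPi);
}

// Angular distance between two objects.
double deltaR(const Direction& a, const Direction& b) {
  return std::hypot(a.eta - b.eta, deltaPhi(a.phi, b.phi));
}
```

Two objects at the same pseudorapidity with phi = 3 and phi = -3 are thus separated by 2π - 6 ≈ 0.28, and not by 6.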

MadAnalysis 5 moreover allows one to automatically handle the overlap between jets and electrons through the command PHYSICS->Isol->JetCleaning(MyJets, MyElectrons, 0.2). This returns a cleaned jet collection.
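The idea behind such a cleaning can be sketched as follows, under the assumption that a jet is discarded as soon as it lies within the given angular distance of any electron. This only illustrates the principle of overlap removal with hypothetical names (Object, cleanJets); it is not the actual MadAnalysis 5 implementation.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for a reconstructed object (jet or electron).
struct Object {
  double pt;   // transverse momentum (GeV)
  double eta;  // pseudorapidity
  double phi;  // azimuthal angle, in radians
};

// Angular distance between two objects, with the azimuthal difference
// wrapped into [-pi, pi].
double deltaR(const Object& a, const Object& b) {
  const double twoPi = 2.0 * std::acos(-1.0);
  return std::hypot(a.eta - b.eta, std::remainder(a.phi - b.phi, twoPi));
}

// Returns the jets that do not overlap with any electron, i.e. those lying
// at an angular distance of at least drMin from every electron.
std::vector<Object> cleanJets(const std::vector<Object>& jets,
                              const std::vector<Object>& electrons,
                              double drMin) {
  assert(drMin >= 0.0);
  std::vector<Object> cleaned;
  for (const Object& jet : jets) {
    bool overlaps = false;
    for (const Object& electron : electrons) {
      if (deltaR(jet, electron) < drMin) {
        overlaps = true;
        break;
      }
    }
    if (!overlaps) cleaned.push_back(jet);
  }
  return cleaned;
}
```

The input collections are left untouched and a new collection is returned, which mirrors the way the cleaned jet collection is obtained above.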

In my last post, I detailed how the CMS experiment was dealing with object isolation. The detector activity within a given angular distance around an object is this time evaluated through

  PHYSICS->Isol->eflow->sumIsolation(obj,event.rec(),DR,0.,IsolationEFlow::COMPONENT)

and then constrained. DR indicates here the angular distance around the object obj to consider in the calculations, and COMPONENT has to be TRACK_COMPONENT (charged particles), NEUTRAL_COMPONENT (neutral hadronic particles), PHOTON_COMPONENT (photons) or ALL_COMPONENTS (everything).
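What such an isolation sum computes can be sketched as below, assuming that the activity is simply the scalar sum of the transverse momenta of the particles found in the cone, and that the candidate itself is not part of the list. The names Particle, coneActivity and isIsolated, as well as the relative threshold, are hypothetical and chosen for illustration; the actual method handles the different components listed above.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for a particle seen by the detector.
struct Particle {
  double pt;   // transverse momentum (GeV)
  double eta;  // pseudorapidity
  double phi;  // azimuthal angle, in radians
};

// Angular distance, with the azimuthal difference wrapped into [-pi, pi].
double deltaR(const Particle& a, const Particle& b) {
  const double twoPi = 2.0 * std::acos(-1.0);
  return std::hypot(a.eta - b.eta, std::remainder(a.phi - b.phi, twoPi));
}

// Sum of the transverse momenta of all particles lying within a cone of
// radius dr around the candidate (assumed not to be part of 'activity').
double coneActivity(const Particle& candidate,
                    const std::vector<Particle>& activity, double dr) {
  assert(dr > 0.0);
  double sum = 0.0;
  for (const Particle& p : activity) {
    if (deltaR(candidate, p) < dr) sum += p.pt;
  }
  return sum;
}

// Relative isolation: the activity in the cone, divided by the candidate
// transverse momentum, must stay below a given threshold.
bool isIsolated(const Particle& candidate,
                const std::vector<Particle>& activity, double dr,
                double maxRelIso) {
  assert(candidate.pt > 0.0);
  return coneActivity(candidate, activity, dr) / candidate.pt < maxRelIso;
}
```

A 50 GeV muon surrounded by 5 GeV of activity within the cone would for example pass a (hypothetical) relative isolation requirement of 0.15.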


SIGNAL REGIONS (AND MORE WORDS ABOUT HISTOGRAMS)


[image credits: Pixabay (CC0)]

A specific experimental analysis usually focuses on the search for a single signal.

However, it may be possible that such a signal could be probed by several similar analyses, all sharing a common ground and featuring small differences.

One often gathers all these ‘sub-analyses’ as a collection of ‘signal regions’ of a single analysis.

As already briefly mentioned in the previous episode, signal regions can easily be declared in the Initialize method of the analysis code by including

  Manager()->AddRegionSelection("region1");
  Manager()->AddRegionSelection("region2");
  Manager()->AddRegionSelection("region3");
  …

Once regions are defined, histograms, introduced in the previous post in the context of a single existing region, can be attached to one or more regions through the commands

  Manager()->AddHisto("histo1",15,0,1200);
  Manager()->AddHisto("histo2",15,0,1200,"region1");

  std::string reglist[ ] = { "region1", "region2"};
  Manager()->AddHisto("histo3",15,0,1200,reglist);

that must also be included in the Initialize method. The first command attaches a histogram (named histo1) to all existing regions, the second one defines a histogram (named histo2) attached to the region region1, and the third command finally links a third histogram histo3 to the two regions region1 and region2. All histograms are, in this example, made of 15 bins ranging from 0 to 1200.

They are then filled at the level of the Execute method of the analysis code,

  Manager()->FillHisto("histo1",value);

where value is the value of the observable to be filled in the histogram. The event weight is appropriately handled when adding

  Manager()->InitializeForNewEvent(event.mc()->weight());

at the beginning of the Execute method.
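To make the binning and the role of the weights concrete, here is a minimal sketch of a weighted histogram with a fixed binning. The Histogram class is a hypothetical stand-in and not the MadAnalysis 5 one; in particular, the way values falling outside the range are stored is an assumption made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a weighted histogram with equal-width bins.
class Histogram {
 public:
  Histogram(std::size_t nbins, double xmin, double xmax)
      : xmin_(xmin), xmax_(xmax), bins_(nbins, 0.0) {
    assert(nbins > 0 && xmax > xmin);
  }

  // Adds the event weight to the bin containing 'value'. Values outside
  // [xmin, xmax) are accumulated in underflow and overflow counters.
  void fill(double value, double weight) {
    assert(!std::isnan(value));
    if (value < xmin_) {
      underflow_ += weight;
    } else if (value >= xmax_) {
      overflow_ += weight;
    } else {
      const double width = (xmax_ - xmin_) / static_cast<double>(bins_.size());
      const std::size_t index =
          static_cast<std::size_t>((value - xmin_) / width);
      // The clamp protects against rounding effects at the upper edge.
      bins_[std::min(index, bins_.size() - 1)] += weight;
    }
  }

  double bin(std::size_t i) const { return bins_.at(i); }
  double underflow() const { return underflow_; }
  double overflow() const { return overflow_; }

 private:
  double xmin_, xmax_;
  std::vector<double> bins_;
  double underflow_ = 0.0, overflow_ = 0.0;
};
```

With 15 bins ranging from 0 to 1200, each bin is 80 units wide, so that a value of 100 ends up in the second bin (index 1), whose content is increased by the weight of the event.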

The digitized version of the histograms can be found in the Output/tth/test_analysis_X/Histograms/histos.saf SAF file, where one assumes that the code has been executed on an event sample named tth.list.

Figures can then be generated by using either the code of @crokkon, of @irelandscape or of @effofex (on GitHub).


IMPLEMENTING A SELECTION STRATEGY


[image credits: Counselling (CC0)]

It is now time to move on to the description of the last ingredient necessary for reimplementing an analysis: cuts (connection with the mower ;) ).

An event selection strategy can be seen as a sequence of criteria (or cuts) deciding whether a given event has to be selected or rejected, the aim being to kill the background as much as possible while leaving the signal as untouched as possible.

A cut can hence be seen as a condition which, if realized, leads to the selection of an event. Cuts must first be declared in the Initialize method, in a similar way as for histograms,

 Manager()->AddCut("cutname1");
 Manager()->AddCut("cutname2","region1");
 Manager()->AddCut("cutname3",reglist);

The first line introduces a common cut (named cutname1) that is applied to all regions. The second line declares a cut specific to the region region1 and the third line introduces a cut that is common to the two regions region1 and region2.

At the level of the Execute function, the application of a cut has to be implemented as

   if(!Manager()->ApplyCut(mycondition, "cutname1")) return true;

where mycondition is a boolean that is true when the event has to be selected. The second argument is simply the name of the cut under consideration. Moreover, cuts have to be applied in the order they have been declared.
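The bookkeeping behind this pattern can be sketched with a small stand-in class. It relies on two assumptions that I state explicitly: a cut only affects the regions it is attached to, and applying a cut returns false once no region survives (which is what makes the early return above meaningful). The CutFlow class and its method names are hypothetical and are not the MadAnalysis 5 manager.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a region and cut bookkeeper.
class CutFlow {
 public:
  void addRegion(const std::string& region) { regions_.push_back(region); }

  // Declares a cut attached to the given regions. An empty list attaches
  // the cut to all regions declared so far.
  void addCut(const std::string& cut, std::vector<std::string> regions = {}) {
    if (regions.empty()) regions = regions_;
    cutRegions_[cut] = regions;
  }

  // To be called at the beginning of each event: all regions are alive.
  void newEvent(double weight) {
    weight_ = weight;
    for (const std::string& region : regions_) surviving_[region] = true;
  }

  // Applies a cut to the surviving regions it is attached to, and returns
  // true if at least one region is still alive afterwards.
  bool applyCut(bool condition, const std::string& cut) {
    assert(cutRegions_.count(cut) == 1);  // cuts must be declared first
    for (const std::string& region : cutRegions_[cut]) {
      if (!surviving_[region]) continue;
      if (condition) {
        sumw_[region][cut] += weight_;
      } else {
        surviving_[region] = false;
      }
    }
    for (const std::string& region : regions_) {
      if (surviving_[region]) return true;
    }
    return false;
  }

  // Sum of the weights of the events of a region that passed a given cut.
  double sumOfWeights(const std::string& region, const std::string& cut) {
    return sumw_[region][cut];
  }

 private:
  double weight_ = 1.0;
  std::vector<std::string> regions_;
  std::map<std::string, std::vector<std::string>> cutRegions_;
  std::map<std::string, bool> surviving_;
  std::map<std::string, std::map<std::string, double>> sumw_;
};
```

In this sketch, an event failing a cut specific to region1 is still processed further as long as region2 is alive, whereas an event failing a cut common to all regions can be dropped immediately.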


[image credits: arXiv]

The evolution of the number of events surviving the different cuts is what we call a cutflow, an example being shown with the image on the right.

The corresponding information can be found in the Output/tth/test_cms_46/Cutflows directory, tth.list being once again the input file.

In this directory, one SAF file per region is available. For each cut, the line sum of weights is the relevant one to look at for getting the number of events surviving a given cut.
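Once the sums of weights have been read off, the efficiency of each cut follows from the ratio of two consecutive entries. The small helper below is a hypothetical illustration of this arithmetic (it does not parse SAF files), the first entry being the initial sum of weights before any cut.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Relative efficiency of each cut: the sum of weights surviving the cut
// divided by the sum of weights entering it. The first entry of 'sumw' is
// the initial sum of weights, before any cut is applied.
std::vector<double> cutEfficiencies(const std::vector<double>& sumw) {
  assert(!sumw.empty());
  std::vector<double> efficiencies;
  for (std::size_t i = 1; i < sumw.size(); ++i) {
    // A cut that receives no event is given a vanishing efficiency.
    efficiencies.push_back(sumw[i - 1] > 0.0 ? sumw[i] / sumw[i - 1] : 0.0);
  }
  return efficiencies;
}
```

With made-up sums of weights of 1000, 500 and 125, the two cuts would have efficiencies of 50% and 25% respectively.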


THE EXERCISE

The exercise of the day is pretty simple. We will focus on the earlier CMS search for dark matter and implement the full analysis. We will restart from the piece of code from last time and implement the cuts provided in Section 3 of the CMS article. We will then apply the code to our usual event sample and get a cutflow.

Hint: the azimuthal angle between the missing energy and any object can be computed as object->dphi_0_pi(pTmiss), with pTmiss being introduced above.
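As its name suggests, such a method returns an azimuthal separation lying between 0 and π. A framework-independent sketch of this folding is given below, together with a typical use in a selection. The function names and the threshold argument are hypothetical; the actual value to use must be taken from the CMS article.

```cpp
#include <cassert>
#include <cmath>

// Azimuthal separation between two directions, folded into [0, pi].
double dphi0Pi(double phi1, double phi2) {
  const double twoPi = 2.0 * std::acos(-1.0);
  return std::fabs(std::remainder(phi1 - phi2, twoPi));
}

// True when an object is well separated in azimuth from the missing
// transverse momentum, for a given minimum separation.
bool isSeparated(double phiObject, double phiMiss, double minDphi) {
  assert(minDphi >= 0.0);
  return dphi0Pi(phiObject, phiMiss) > minDphi;
}
```

An object at phi = 3 and a missing momentum at phi = -3 are thus only 2π - 6 ≈ 0.28 apart in azimuth.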

Don’t hesitate to write a post presenting and detailing your code.

  • If tagged with utopian-io and blog as the first two tags (and steemstem as a third), the post will be reviewed independently by steemstem and utopian-io.
  • If only the steemstem tag is used, the post will be reviewed by steemstem and get utopian-io upvotes through the steemstem trail.

The deadline is Tuesday July 31st, 16:00 UTC.


MORE INFORMATION ON THIS PROJECT

  1. The idea
  2. The project roadmap
  3. Step 1a: implementing electrons, muons and photons
  4. Step 1b: jets and object properties
  5. Step 1c: isolation and histograms
  6. Step 1d: cuts and signal regions (this post)

Participant posts (alphabetical order):


STEEMSTEM

SteemSTEM is a community-driven project that has now been running on Steem for almost 2 years. We seek to build a community of science lovers and to make Steem a better place for Science, Technology, Engineering and Mathematics (STEM). In particular, we are now actively working on developing a science communication platform on Steem.

More information can be found on the @steemstem blog, in our discord server and in our last project report.


Everytime I read a post from @lemouth I know I'll read it twice and understand less than half of it.

Still, it's good to read up on fields completely different from the one I've been trained in and this is about as far away from molecular biology as I can think of.

Would the term "simulated theoretical physics" be appropriate for what you're doing here?

Same thing, don't understand that much, but love to read his writings nevertheless. ;-)

Please check my answer to @tking77798 :)

The problem is that here, if you are not "into the project" (i.e. having read the 5-6 preceding posts and having their content in mind), anything beyond the first paragraph is probably obscure. I cannot summarize the previous episodes in more detail, for the simple reason that the post would become just too huge.

For this project, simulations matter but they will come later. What we aim for, at the end of the day, is to simulate any signal for any theory and check how well (or how badly) the LHC could find them.

Does it make things clearer? :)

as dark matter is invisible

I guess that's one of the reasons it is called "dark" matter.

There's something I've been thinking; though not directly related to the post - or maybe remotely. How close are scientists to discover dark matter? (I mean, to observe dark matter directly). Does it actually exist in reality?

I stumbled on a publication that talked about the discovery of some high energy "ghost particle". Does that account for the discovery of dark matter?

I guess that's one of the reasons it is called "dark" matter.

Yes: it does not interact electromagnetically. Since electromagnetism is connected to light, it is called dark :)

There's something I've been thinking; though not directly related to the post - or maybe remotely. How close are scientists to discover dark matter? (I mean, to observe dark matter directly). Does it actually exist in reality?

There is a bunch of indirect evidence for dark matter, and it is very strong. The only missing point is a direct observation. However, the results have not ruled out all dark matter options (we are even far from there).

But I cannot really answer the question. I hope we will know as soon as possible :)

I stumbled on a publication that talked about the discovery of some high energy "ghost particle". Does that account for the discovery of dark matter?

I have no idea on what you are talking about. Can you please provide me a reference?

Thanks a lot for the kind reply sir.

Can you please provide me a reference?

I found one of the articles, but I can't locate the rest. I should have bookmarked them; but maybe because I didn't show much interest in them.
Here's the link to the one I found:

Those are neutrinos. I wrote two posts on them recently, and I will actually write something on them soon to discuss the discovery you mention :)

I'd be waiting for it sir :)

Maybe tomorrow. No time at the moment ;)

My hat is off to @crokkon in particular, excellent work on the histogramming.

Yours was good too. Do you plan to write a post on it? :)

@lemouth I am thinking of it - I'm not sure if I'll do a utopian #developer style post, as I really think @crokkon's should take center stage. I may do a steemstem post about it or a utopian #blog post focusing more on approaching matplotlib as a ggplot2 person.

You can do it either as a steemstem post (recommended) or a utopian-io blog post. I won't make it a developer entry.

As usual, all of your initiatives are nice and insightful for all the particle physics community and not only as I get to see.
Too bad there are not too many like you. Have you thought of teaming up with other "you's" from the multiverse? :)
How could you signal them to get them to cooperate?

I will check the participant posts now, I still need to concentrate a bit to understand but this keeps me open to what you keep saying.

Like Dr Harrison Wells from the movie "Flash".. S.T.A.R Labs, Earth 2 :)

Earth 81 in my case ;)

I would actually need a lot of mini-me's :D

Please check the participant posts. They are great :)

This is in particular crucial for what concerns dark matter, as dark matter is invisible.

Do we know that dark matter is 100.00000% invisible?

Could it be 99.9999% invisible? That is, the electromagnetic coupling constant is small but still there? (I have such a shit memory, I may have already asked you this).

Do we know that dark matter is 100.00000% invisible?

It depends on the definition of invisible. Dark matter is invisible at the level of an LHC detector. It will leave no track in it. We will just 'see' something missing (deduced from energy conservation). Dark matter is also not electromagnetically interacting. In this way, it could be seen as invisible as electromagnetism is connected to light.

I assume I more or less answered the second question too, didn't I? :)

Hello hello!

Finally, analyses are often making use of the energy carried away by some invisible particles. This is in particular crucial for what concerns dark matter, as dark matter is invisible.

So I get that all energy missing in the transverse plane is obviously measured and then linked to the existence of dark matter, providing scientists with the ability to create a dark matter model of some sort?

I admire your efforts, this is such an incredible initiative!

Have a good day!

So I get that all energy missing in the transverse plane is obviously measured and then linked to the existence of dark matter, providing scientists with the ability to create a dark matter model of some sort?

Not necessarily. Neutrinos are also invisible. Also, mismeasurements induce fake missing energy too.

To summarize, a large amount of missing energy could be related to dark matter, but the Standard Model and experimental effects contribute too (they make the background).

Hi, trying to work on this early as I'm soon on holiday.
I have a bit of an issue because at the moment none of the events in the sample file seems to meet the Ptmiss > 170 requirement.

As per your instruction the code is as follows:

  if(!Manager()->ApplyCut(
    event.rec()->MET().momentum().Pt() > 170,
    "pTmiss"))
  {
    return true;
  }

Am I being wrong about the check?

Thanks!

No, the sample is not very appropriate for this analysis. It consists in the production of a pair of top quarks with a Higgs boson. This is miles away from dark matter ;)

Ok that makes sense.
So do you have any sample to run the code against or should we just provide code that compiles?

I will generate a relevant sample, hopefully tonight or tomorrow. That is a very good idea!

Hi @lemouth,
I have posted my solution for exercise 1d.
I know I haven't had a chance to test it against a sample file but I'm going abroad in a couple of days which wouldn't give me much time to apply any correction to the code.

So at least the code is available for review if you have the time.
I hope that's OK.

Cheers,
Olivier

I will try to do my best to provide you a test sample tomorrow. Not too sure if I will manage, but I will try. I will review the code in any case :)

When will you be back?

I'll be back on the 30th of July. Leaving this Friday. Not sure yet if I take my laptop with me.
I have this urge to write at the moment.
That might be a problem with my wife though. 😉

Don't worry, I will still be there at your return. Please take your time :)

Do I need a professor to understand all this? A humble request is if you can add what happened and going to happen as a conclusion or use common word while writing the post, lay man like me would be able to digest more of this info.

You probably need to read the first 5 posts of the series. Otherwise, there is little chance to get anything going beyond the first paragraph. By the way, what you asked has been already provided in the first posts of this series (I just don't want to repeat here, as the post is already long enough).

Thanks for help ... I tried again and failed again, never been a good pupil. I think this is hazardous for my health :p

Please ask any question about where you are stuck (I know, this stuff is a bit technical but this is necessary for the project). You can also read @irelandscape blog where he explains how he understands the topic.

@lemouth I would love to pass this, for now, :p and waiting for your next post to test if I can unlock the code hidden in your writing :)

The next one will be easier. I may write it tonight. I have an important news to discuss ;)

My spy eyes are on you and your post ;)

Okidooki! I like being spied ;)

Hi looking forward to work on this before a short break for the summer holiday.
Will be away from the end of next week so this is perfect timing.
Cheers!

I am looking to read your contribution, as usual ;)

Hey @lemouth
Thanks for contributing on Utopian.
We’re already looking forward to your next contribution!

