معرفی شرکت ها

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

توضیحات

Module for reading EDGES data and working with EDGES databases.

ویژگی	مقدار
سیستم عامل	-
نام فایل	edges-io-4.1.3
نام	edges-io
نسخه کتابخانه	4.1.3
نگهدارنده	[]
ایمیل نگهدارنده	[]
نویسنده	EDGES Collaboration
ایمیل نویسنده	steven.g.murray@asu.edu
آدرس صفحه اصلی	https://github.com/edges-collab/edges-io
آدرس اینترنتی	https://pypi.org/project/edges-io/
مجوز	mit

======== edges-io ======== .. image:: https://travis-ci.org/edges-collab/edges-io.svg?branch=master :target: https://travis-ci.org/edges-collab/edges-io .. image:: https://codecov.io/gh/edges-collab/edges-io/branch/master/graph/badge.svg :target: https://codecov.io/gh/edges-collab/edges-io .. image:: https://readthedocs.org/projects/edges-io/badge/?version=latest :target: https://edges-io.readthedocs.io/en/latest/?badge=latest :alt: Documentation Status **Module for reading EDGES data and working with EDGES databases.** This package implements all necessary functionality for reading EDGES data. It's two main concerns are: 1. Reading the various file formats required for EDGES data: - VNA readings - fastspec output - thermistor readings - field weather recordings - field thermlog recordings 2. Verifying and exploring databases of measurements in a robust and reliable way. Features ======== Some features currently implemented: * Verify a "calibration observation" quickly without reading any actual data, with a nice command-line tool: ``edges-io check``. * Optionally apply various automatic _fixes_ to a calibration observation to bring it into line with standard database layout. * Read ``acq``, ``h5``, ``mat`` and ``npz`` spectrum files seamlessly. * Read S1P files. * Verification of read data. * Intuitive class hierarchy so that any subset of an observation can be handled. * Read field-based weather and thermlog information Installation ============ Installation should be as simple as either one of the following:: $ pip install git+git://github.com/edges-collab/edges-io or, if you would like to develop ``edges-io`` and use it too:: $ git clone https://github.com/edges-collab/edges-io $ cd edges-io $ pip install -e .[dev] There are a few dependencies, which should be installed automatically when following the above command. If you are using ``conda`` (which is recommended) then you can obtain a cleaner/faster install by doing the following:: $ conda create -n edges python=3 $ conda activate edges $ conda install numpy scipy h5py And then following either of the above instructions. Usage ===== You can use ``edges-io`` either as a library or a command-line tool. The library is self-documented, so you can look at the docstring of any of the available functions. We describe some basics of each approach here. CLI --- To run the checking tool, simply do:: $ edges-io check PATH ``PATH`` should be the top-level directory of a calibration observation (i.e. a folder that has a sub-folder ``25C/``, which has subfolders ``Spectra/``, ``Resistance/`` and ``S11`` etc.). There are a few options you can use, for example changing the temperature of the observation, and enabling automatic fixes. The latter can be achieved simply with the ``--fix`` flag. If you find that a particular kind of error happens regularly, `make an issue <https://github.com/edges-collab/edges-io/issues/new>`_ so we can add the fix. Library ------- The library is useful for gathering an entire observation and performing operations on its data. The library exposes a hierarchy of calibration objects, including base objects like a ``Spectrum``, ``Resistance`` or ``S1P`` file, and container objects like ``Spectra`` or ``S11``. An entire observation can be loaded as a ``CalibrationObservation``, and it contains references to all children. For example:: >>> from edges_io import io >>> obs = io.CalibrationObservation("path_to_observation") >>> print(obs.s11.path) "path_to_observation/25C/S11" >>> print(obs.spectra.ambient.path) "path_to_observation/25C/Spectra/Ambient_XXX.acq" >>> ambient_spectrum = obs.spectra.ambient.read() See how ``edges-io`` is used in `edges-cal <https://github.com/edges-collab/cal_coefficients/tree/master/src/edges_cal/cal_coefficients.py>`_ for a more involved example. Defining Observations --------------------- One of the main goals of ``edges-io`` is to make the definition of a "Calibration Observation" as clear, robust and error-free as possible. Many files go into any particular observation -- spectra, resistance measurements and S11 measurements -- which are all required to form together a calibration solution (which can then be applied independently to field data). This code provides a clear structure for how these files *must* be laid out in order for them to be read and used automatically. This is done in a formal sense in `this document <docs/structure.rst>`_, but is also implemented within the code itself. In the above document, the specification is laid out as formally as possible, and that document has the final word on what is allowable. However, this can mean it's a bit hard to interpret, and so we here present a "simpler" guide to what constitutes a "Calibration Observation". A single natural "Observation" (see below for how to combine multiple observations into a single "virtual" observation) is a single directory with multiple files/subdirectories in it. That directory must be named under a certain convention that time-stamps it and gives some useful metadata of the observation (like which receiver number was measured). It is possible that in the future, a metadata file within the observation will specify most of this information, but it is also useful to have a unique label for the observation. One question that is important in all of this definition is what to do when either 1) a file exists that shouldn't be there, or 2) a file doesn't exist that *should* be there. It may be tempting to overlook extra files that shouldn't be there. However, they can be a source of error. For example, spectra can be split across multiple files, and we use a file pattern to find the files that should be read in. If an extra file that "shouldn't be there" exists and the file pattern matches it, then errors can occur (even worse if the contents of that file are able to be read by the spectrum reader, but correspond to a different load or something of that sort, where the results will be wrong, but no error raised). Thus, when *checking* the integrity of an observation we flag extra files as *errors*, and require the user to fix them up. To make this a bit easier, and let those files stay in the directory (so we don't lose potentially valuable information), one of a few extensions can be added to the extra file: * ``.old``: for files that contain valid data but that is superseded by newer measurements and should be ignored, * ``.invalid``: for data that has something wrong with it (equipment broken, wrong input parameters, etc.), * ``.ignore``: files to ignore for any other reason. If the file does not have one of these extensions, and is not in the list of accepted files for the Observation, an error will be raised by the checker. On the other hand, if a file is *missing* that must be there, different things can happen in different situations. The default case is to treat this as a warning, which may be counter-intuitive (surely missing a required file should be an error?!). The reason for this is that that file may be supplemented by a different Observation. Perhaps this Observation is incomplete -- maybe all the data that was taken was a single set of Spectra, which is supposed to complement a previous observation which had a full set of measurements. In this case, while the "natural" Observation is incomplete, it is not necessary to give an error, as long as a warning is given such that it must be combined with another observation. Nevertheless, some combinations of files are required to have been taken in the same physical observation to ensure consistency (namely, S11 measurements for each standard in a given load). If particular standards are missing, an error will be raised. These caveats should be kept in mind as we talk about "required" directories/files below. "Required" will mean that after combining all the observations that we want/need (see next section), we require this particular file. Within the top-level observation directory are a number of directories denoting the ambient temperature at which the observation was taken. These will usually be 15C, 25C or 35C. Most newer observations are at 25C. One should never mix files between different ambient temperatures. Thus, in reality, an observation is contained within one of *these* folders, and in practice, the ``CalibrationObservation`` has its ``path`` attribute set to the temperature directory. Inside this directory can be up to two files, and exactly three folders. One of the files is a ``Notes.txt`` file which summarises human-readable notes about the observation ("we ran the ambient spectra first, but had a delay because of xxx..."). The other file is named ``definition.yaml`` and includes metadata about the observation in a specific format (this file also allows you to supplement the observation with other observations, but we'll get to that later). Measurements/data like the male/female resistance should be put in here (til now they have been found somewhere an input manually by the analyst when doing the calibration, which is very risky and prone to error -- they are properly part of the measurement itself, not a choice of the analyst). The three folders are ``Spectra``, ``Resistance`` and ``S11``. Note that an observation must have all three of these (and nothing else, after combining observations). Within ``Spectra`` exist a bunch of spectra taken over about 12-24 hours for each of four "calibration sources" in the lab: they are "Ambient", "HotLoad", "LongCableOpen" and "LongCableShorted" (often referred to as their simple aliases "ambient", "hot_load", "open" and "short" in the code). These spectra will be in either ``.acq`` or ``.h5`` format, depending on the version of ``fastspec`` that took the measurements. Due to the way ``fastspec`` takes its data, each source may have multiple files for a single measurement (each integration is saved to a new line in the file, but a new file is created at particular local times each day). Thus, typically one would like to read in and concatenate _all_ the files for that load, to use all the data. Beyond this, it is _possible_ that two fully separate "runs" for a given source/load will be made. In this case, an identifier for the "run number" is put into the filename. Only one run number is actually used to do any particular calibration. In practice, it is *very rare* to have more than one "useable" run number for any particular load. Typically, a second run is only taken if it is deemed necessary due to the first being invalid in some way. If this is the case, this should be noted in the ``Notes.txt`` and/or the ``definition.yaml``. The ``Resistance`` folder is almost exactly the same as the ``Spectra``. Each of the sources is represented here again (with the same names), and the filename format is the same, except that the files themselves are all ``.csv``. These measure the resistance readings of the sources, which are used to derive the physical temperatures of the loads (against which the spectra are calibrated). Again, each source is allowed to have multiple "runs" specified by their "run number". However, again in practice it is very uncommon to have more than one usable run. The ``S11`` folder contains measurements of the reflection coefficients of the sources, along with the LNA itself and the internal switch. These are all made with a VNA, and each reading takes of order a minute. Thus, multiple readings of these measurements can be taken -- and typically are taken. Inside the ``S11`` folder exist a folder for each of the main loads (or sources), in which are measurements of the four standards (``open``, ``short``, ``match`` and ``external``). Each of these standards can be measured multiple times, and so each file has the format ``<standard-name><rep-num>.s1p``, where ``rep-num`` goes from 1 - 99. However, each of the standards for a load is measured one after the other on the same connection (i.e. there is no disconnection between them, to avoid issues with different connection characteristics between the standards). Thus, one can't choose to use repeat number 01 for ``open`` and repeat number 2 for ``short`` for the ``Ambient`` source. For a given source, all standards used must be of the same repeat number (but multiple runs can *exist* for the source). Besides the S11's of the sources, we also need measurements of the LNA reflections, and the internal switch. These exist in the folders ``ReceiverReadingXX`` and ``SwitchingStateXX`` respectively. Here the ``XX`` correspond to what we call a "run" number, which correspond to a complete re-measuring of the standards at different points in the observation process. An arbitrary number of these can be performed (up to 99), but only one is required. In all cases, the default behaviour of ``edges-io`` is to use the *last* run number and repeat number available for any given measurement. Combining Multiple Observations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ As of v0.4.0, ``CalibrationObservation`` objects no longer need to be defined fully by one directory containing all measurements. While that is still an option (and the easiest way to define a calibration observation), they can also be defined in a more sophisticated way internally or externally. Internally, a ``definition.yaml`` file is allowed (and encouraged) which defines properties of the observation, and also has ``include`` and ``prefer`` keywords which are used to supplement or override any particular parts of the observation. For example ``include`` could point to the top-level of any other observation, which could then be used whenever the main observation lacks data. If this file exists, by default it is used to construct the full observation virtually. An incomplete example of such a definition file can be found `here <example-obs-definition.yaml>`_. Externally, a different file format is used to explicitly define every single measurement file in an observation. This is supposed to be exhaustive and complete to make it unambiguous. An example can be found in the `test-suite <tests/test_data/observation.yaml>`_. One can use such a file to create a ``CalibrationObservation`` by using the ``CalibrationObservation.from_observation_yaml()`` function. The way the code actually handles these "virtual" observations is essentially to create a temporary directory and make symlinks to all the files that are required. This virtual observation then looks and feels like a normal single observation, but is in fact patched together from various observations. Using the ``HDF5Object`` ------------------------ ``edges-io`` contains a convenient ``HDF5Object`` class whose purpose is to make working with HDF5 files a bit more formal (and arguably more simple). By subclassing it, you can specify an exact file layout that can be verified on read, to ensure a file is in the correct format (not just HDF5, but that it has the correct data structures and groups and metadata). Using such a class is meant to provide a very thin wrapper over the file. So, for instance if you have a file ``my_hdf5_format_file.h5``, whose structure is defined by the class ``CustomH5Format``, you can create an object like this:: >>> fl = CustomH5Format("my_hdf5_format_file.h5") Directly on creation, the file will be checked for compatibility and return an error if it contains extraneous keys, or lacks keys that it requires. Once created, the ``fl`` variable now has operations which can "look into" the file and load its data. It supports lazy-loading, so doing:: >>> print(fl['dataset'].max()) will load the 'dataset' data, and get the maximum, but it will not keep the data in memory, and will not load any other datasets. If you have data in groups, you can easily do:: >>> print(fl['group']['dataset'].min()) To load the data into the object permanently use the ``.load`` method:: >>> fl.load('group') In fact, doing this will load all data under 'group'. If you just wanted to load "dataset" out of "group":: >>> fl['group'].load('dataset') An example of how to define a subclass of ``HDF5Object`` can be seen in the ``HDF5RawSpectrum`` class, which is used to define ``fastspec`` output files. How the code works in a bit more detail --------------------------------------- For the sake of developers (lets face it, most users of this particular repo should also be developers), we will try to explain in a little more detail how the code works here. This will focus on how the code treats the organization of a calibration observation, and how it performs checks and makes fixes. The basic idea is that each directory, and each kind of file, is represented by a distinct class, describing that kind of thing. For example, the top-level directory (actually, the top-level plus the ambient temperature directory) is represented by the ``CalibrationObservation`` class, while the ``Spectra`` directory is represented by the ``Spectra`` class, and S1P files are represented by the ``S1P`` class. All of these classes are subclasses either of ``_DataContainer`` (if it's a folder) or ``_DataFile`` (if it's a file). All of them have a ``path`` attribute which points to its own path on-disk. ``_DataFile`` classes are much simpler, and typically only know how to check its own filename for consistency with the specification, and how to read the data in that particular filetype (they know nothing about their parents). ``_DataContainer`` classes know about their own ``path``, but also can determine a list of files/subfolders they *contain* (they know nothing about their parents), and know how to map these files/folders onto their relevant defining classes. They are able to check their own path for consistency, ensure that all relevant sub-files exist, ensure that none extra exist, and recursively check the consistency of their sub-files and folders by calling their checking methods. Each file and folder in the observation becomes a specific *instance* of one of these classes (there will be multiple ``S1P`` instances for all of the S11 measurements, and each may have a different ``name`` attribute to identify the standard it represents). This top-down hierarchical structure is useful, and similar the to the way Unix filesystems operate. However, it does mean that a particular instance is not necessarily unique: the "match" standard S11 will exist within all sources, and since each class doesn't know its parent, the ``Ambient/Match01.s1p`` cannot be distinguished from the ``HotLoad/Match01.s1p``. However, a method exists on the top-level ``CalibrationObservation`` which can match a particular input path to a unique *sequence* of instances which do uniquely define it (i.e. the first would be a sequence containing a ``LoadS11`` class with ``name=Ambient`` and the second would contain a ``LoadS11`` class with ``name=HotLoad``). Another thing to note about the setup is the different between the classes and *instances* of those classes. Much of the functionality of the system is implemented just through the classes themselves -- one does not need to make instances of the classes to perform the filesystem checks, for instance. In this case, the ``path`` is given to the ``check()`` method of the class, eg. ``CalibrationObservation.check(path)``, which itself will call the ``check`` method of any of its children etc. This will never read any data, it will just check filename formats and contents of directories. However, one can make an *instance* of the ``CalibrationObservation``, which will itself go and make instances of all its children, storing them in the top-level class in a nice hierarchical way, in which each of the children can be used independently. By default, when you create such an instance, it will first perform the full check that would have been performed (but in this case it should exit at the first error raised, and raise it as an error, rather than continuing and printing all errors). Notably, these instances can be used to *read* the data in the files themselves. The instance will also decide *which* files to use in the observation (i.e. which run numbers and repeat numbers). Note ==== This project has been set up using PyScaffold 3.2.3. For details and usage information on PyScaffold see https://pyscaffold.org/.

نیازمندی

مقدار	نام
-	numpy
-	scipy
-	h5py
-	read-acq
-	cached-property
-	bidict
-	pyyaml
-	rich
-	attrs
-	toml
-	dill
-	psutil
>=5.0.0.dev0	hickle
>=5.0	astropy
-	importlib-metadata
>=0.4<1.0	read-acq
-	pre-commit
-	pytest

نحوه نصب

نصب پکیج whl edges-io-4.1.3:

pip install edges-io-4.1.3.whl

نصب پکیج tar.gz edges-io-4.1.3:

pip install edges-io-4.1.3.tar.gz