<resource schema="hsoy">
	<meta name="creationDate">2015-07-16T09:19:00</meta>
	<meta name="schema-rank">100</meta>
	<meta name="title">The HSOY Catalog</meta>
	<meta name="description" format="rst">
	HSOY is a catalog of 583'001'653 objects with precise astrometry based on
	PPMXL and Gaia DR1.  Typical formal errors at mean epoch in proper motion are
	below 1 mas/yr for objects brighter than 10 mag, and about 5 mas/yr at the
	faint end (about 20 mag). South of -30 degrees, astrometry is significantly
	worse.  HSOY also contains, where available, USNO-B, Gaia, and 2MASS
	photometry.  HSOY's positions and proper motions are given for epoch J2000.
	The catalog becomes severely incomplete faintwards of 16 mag in the G-band.
	The mean epochs are typically very close to Gaia's J2015.

	HSOY still contains about 0.7% spurious close
	"binaries" (non-matched stars) from the original USNO-B (marked with non-NULL
	clone).  Also, failed matches within Gaia DR1 contribute another 1.5% spurious
	pairs (marked with non-NULL comp).  In both cases, astrometry presumably is
	sub-standard.

	More information is available at http://dc.g-vo.org/hsoy.
	</meta>
	<meta name="creator">Demleitner, M.; Röser, S.; Altmann, M.; Bastian, U.;
	Schilbach, E.</meta>

	<meta name="subject">stars</meta>
	<meta name="subject">surveys</meta>
	<meta name="subject">astrometry</meta>
	<meta name="subject">proper-motions</meta>

	<meta name="contentLevel">Research</meta>
	<meta name="instrument">Gaia</meta>
	<meta name="source">2017A&amp;A...600L...4A</meta>
	<meta name="doi">10.21938/HP1ASPBYWElyXb6qdqyolg</meta>
	<FEED source="//procs#license-cc0" what="HSOY"/>

	<meta name="_news" author="MD" date="2017-01-31" role="updated">
		Added a column no_sc to mark objects without a counterpart in
		SuperCOSMOS; these are presumably spurious, and filtering on
		no_sc is NULL significantly improves proper motion statistics.
	</meta>
	<meta name="coverage.waveband">Optical</meta>
	<coverage>
		<temporal>2000-01-01 2000-01-01</temporal>
		<spatial>0/0-11</spatial>
		<spectral>8.277e-20 5.369e-19</spectral>
	</coverage>

	<meta name="_longdoc" format="rst"><![CDATA[
		The HSOY catalog is a common reduction of PPMXL and Gaia DR1.  It
		essentially applies the technique discussed in
		:bibcode:`2010AJ....139.2440R`, with PPMXL filling the role of USNO-B
		and Gaia DR1 filling the role of 2MASS.  The concrete implementation
		is fairly different, though.  Interested users are referred to
		the `HSOY resource descriptor`_ (this is a DaCHS file, but actual
		processing is written in SQL), in particular the add_gaia element.

		Access Options
		--------------

		Like Gaia DR1, HSOY is primarily published through a TAP service; the
		primary site is ivo://org.gavo.dc/tap with the access URL
		http://dc.g-vo.org/tap.

		For simple tasks, there is an IVOA cone search service at
		http://dc.g-vo.org/hsoy/q/q/scs.xml (browser interface at
		http://dc.g-vo.org/hsoy/q/q/form).

		If you insist, the catalog content can also be downloaded in the
		form of an `xz-compressed Postgres ASCII dump`_.  This is a 40 GB download,
		and you'll probably spend more time making this useful for you than you'd
		spend `learning ADQL`_ and just using TAP.  If you do pull the dump,
		you will find the metadata (including the column sequence) at
		http://dc.g-vo.org/tableinfo/hsoy.main.  To get this a bit more
		machine-readably, run ``SELECT TOP 1 * FROM hsoy.main`` on our
		TAP service.
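		
		As a minimal sketch of a TAP query (the 0.1 degree radius, the position,
		and the column selection are arbitrary choices for illustration), a cone
		selection could look like this in ADQL::

			SELECT TOP 100 ipix, comp, raj2000, dej2000, pmRA, pmDE, phot_g_mean_mag
			FROM hsoy.main
			WHERE 1=CONTAINS(
				POINT('ICRS', raj2000, dej2000),
				CIRCLE('ICRS', 31.9, 61.7, 0.1))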

		.. _learning ADQL: http://docs.g-vo.org/adql
		.. _xz-compressed Postgres ASCII dump: http://dc.g-vo.org/hsoy/q/download/form

		Statistics
		----------

		The HSOY catalog contains 583'001'653 objects; this is significantly fewer
		than both the roughly 9e8 of PPMXL and the roughly 1.2e9 of Gaia DR1. The
		bulk of the objects missing in HSOY versus PPMXL should be non-stellar
		detections and failed matches PPMXL inherited from USNO-B1, but of course
		the inhomogeneous coverage of Gaia DR1 can be expected to play a role, too.

		Positions are given for epoch J2000.0 in ICRS (as defined by Hipparcos via
		PPMX).  Since Gaia astrometry is so precise, the mean epochs are fairly
		close to J2015 for almost all stars (mean: around 2014.8), even though many
		stars have observations going back to the 1950s (and older, for PPMXL's
		bright end is PPMX, which contains even older observations).  This means
		that the *positional errors given do not apply* at J2000; there, formal
		errors in RA and Dec are of the order of 0.1 arcsec for most objects (tighter
		estimates can be derived from the errors in proper motion).

		Since the Gaia position is so dominant, the mean error in position at mean
		epoch is of the order of the DR1 positional errors (about 3 mas averaged
		over the magnitude bins, much better at the bright end, much worse at the
		faint end).
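		
		As a sketch of how tighter J2000 estimates might be derived: assuming the
		errors in position at mean epoch and in proper motion are uncorrelated,
		and assuming (as the value ranges in the table below suggest) that
		positional errors are given in degrees and proper motion errors in degrees
		per year, the formal errors propagated to J2000 could be computed in ADQL
		like this (the output names and the sky region are purely illustrative)::

			SELECT TOP 100 ipix, comp,
				SQRT(POWER(e_raepRA, 2) + POWER((2000.0-epRA)*e_pmRA, 2))
					AS e_ra_j2000,
				SQRT(POWER(e_deepDE, 2) + POWER((2000.0-epDE)*e_pmDE, 2))
					AS e_de_j2000
			FROM hsoy.main
			WHERE 1=CONTAINS(
				POINT('ICRS', raj2000, dej2000),
				CIRCLE('ICRS', 31.9, 61.7, 0.1))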

		Mean formal errors in proper motion at mean epoch range from better
		than 1 mas/yr for bright stars to about 5 mas/yr near the faint end.
		Note that these are really formal results of a simple least-squares
		reduction. They cannot be considered as absolute accuracies, as PPMXL
		has spatial- and magnitude-dependent distortions in its proper motion
		system. These distortions are slightly reduced by incorporating Gaia
		observations, but they are not fully removed. Due to the short life
		expectancy of HSOY, a careful re-reduction of PPMXL is out of the question.

		This plot shows the mean errors in proper motion in RA (red) and Dec
		(blue), in bins of 1 mag:

		.. image:: /\rdId/q/static/e_pm_vs_gmag.png

		Since the early epochs are missing in USNO-B1 south of -30 degrees (no
		early surveys available), the errors in proper motion
		are much larger there; here
		is the distribution of errors in RA:

		.. image:: /\rdId/q/static/e_pmra.average.png

		and in Dec:

		.. image:: /\rdId/q/static/e_pmde.average.png

		To illustrate HSOY's coverage, this is a plot of the number of objects
		per level-6 healpix cell:

		.. image:: /\rdId/q/static/densityplot.png
		
		Here is a table of minimal, mean, and maximal values of the main HSOY
		columns:

		================== =================== ===================== ===================
		           **col**             **min**               **avg**             **max**
		           raj2000  5.684374286825e-07        210.993388137        359.999993202
		           dej2000      -89.9928427252       -9.98338171978        89.9901000584
		          e_raepRA         4.74927e-09    9.14502574957e-07          2.38159e-05
		          e_deepDE         6.27509e-09    7.75452288327e-07          2.36765e-05
		              pmRA         -0.00216075   -4.43904220127e-07           0.00242583
		              pmDE         -0.00245665   -1.09799220875e-06           0.00257913
		            e_pmRA         1.14448e-08    8.34744285904e-07          2.68803e-06
		            e_pmDE         1.13905e-08    8.34238460892e-07          2.62362e-06
		              epRA             1975.62         2014.8336071               2015.0
		              epDE             1978.95        2014.88429072               2015.0
		              Jmag              -0.676        15.1470283958               21.749
		            e_Jmag               0.013      0.0737119978739                9.999
		              Hmag              -1.739        14.5851051481               24.292
		            e_Hmag                 0.0      0.0891082665977                9.999
		              Kmag              -2.099        14.3843849985               20.532
		            e_Kmag                 0.0       0.102864557141                9.999
		             b1mag                 1.6        18.6243196919                 50.0
		             b2mag               0.014        18.7001648827                65.38
		             r1mag              -3.159        17.3583684551                65.47
		             r2mag               0.024         17.710484873                62.38
		              imag               2.171         16.704232572                 64.5
		              nobs                   3   5.5826345577950630                   17
		           gaia_id               65408  4045094535853177834  6917528993283204480
		   phot_g_mean_mag             3.15463        17.8191975575              28.6129
		 e_phot_g_mean_mag         1.06456e-05     0.00625220915035                0.109
		================== =================== ===================== ===================

		G magnitudes and their errors are only given where the flux error as
		given in DR1 is smaller than 10% of the flux.  In that case, the
		flux errors are translated into magnitude errors as 1.09*flux_err/flux.
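		
		The factor 1.09 follows from first-order error propagation through the
		definition of the magnitude (here, F is the flux and sigma_F its error;
		the symbol names are chosen for illustration only)::

			m = -2.5 log10(F) + const
			sigma_m ~ (2.5/ln(10)) * sigma_F/F ~ 1.0857 * sigma_F/F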

		Known Flaws
		-----------

		(1) Duplicates: PPMXL contains many non-matched stars, i.e., two
		objects where only one should be present.  Where the proper motion
		computed for these stars is roughly correct, these non-matched pairs
		end up in HSOY as well.  All will be matched to the same Gaia DR1
		object, which makes them behave rather oddly kinematically (they
		"collide" at the epoch of Gaia).  Such cases are marked with a 1 in
		the clone column.  Clearly, astrometry has to be used with great
		care for these objects.

		(2) Conversely, quite a few PPMXL objects match several Gaia DR1
		objects.  Since the true autocorrelation of Gaia DR1 (and several
		following releases) has a sharp drop at 2 arcsec for technical
		reasons, these objects do typically *not* correspond to multiple
		star systems but reflect match failures in the construction of Gaia
		DR1.  In HSOY, such cases are marked with a non-NULL comp (which may
		remind you of component; in some rare cases, that association could
		actually match reality).  This is a
		plot of (sum(comp)+sum(clone))/n in level-6
		healpix cells, which is a good indicator of the severity of the problems
		with (1) and (2).  The grid pattern represents the overlap areas in
		the surveys that entered into USNO-B, where problem (1) is particularly
		severe:

		.. image:: /\rdId/q/static/pair_density.png

		(3) As discussed in :bibcode:`2010AJ....139.2440R`, PPMXL inherited
		from USNO-B about 2.5e7 spurious objects with high proper motions, mostly
		on the edges of plates, where pairs were matched that should not have been
		matched.  Most of these objects did not hit Gaia DR1 objects when moved to
		J2015.  Thus, HSOY only contains about 2.5e6 objects faster than 150
		mas/yr (2.5e5 on the northern sky).  Comparing this with the expectation
		of less than 1e5 from :bibcode:`2005AJ....129.1483L` in the northern
		sky, it is clear
		that there still are many spurious high-PM objects in HSOY.
		To aid in discarding invalid high-PM objects, we matched
		PPMXL with SuperCOSMOS (cf. :bibcode:`2001MNRAS.326.1279H`), which is an
		independent extraction of the plates USNO-B is based on.  Objects failing
		to match at J2000 received a 1 in HSOY's ``no_sc`` column.  Note that
		for technical reasons, False in ``no_sc`` is encoded as NULL.
		Therefore, to filter out probably spurious objects,
		write ``WHERE no_sc IS NULL``.

		This reduces the number of high-PM objects by almost another factor of
		two all-sky (to 1.4e6).  The effect is particularly pronounced on the
		northern sky, where only 1.7e5 objects with PM>150 mas/yr are left, well
		within a factor of two of the gold standard set by LSPM.

		Here are comparisons of the distributions of total proper motions in
		HSOY for all (blue) and ``no_sc IS NULL`` (red) objects.  Top or left
		is for the total catalog, bottom or right just the northern sky.

		.. image:: /\rdId/q/static/pm_histograms.png
		
		.. image:: /\rdId/q/static/pm_histograms_north.png

		To get an idea of where the obvious problem on the southern sky comes from,
		see this density map of stars faster than 150 mas/yr in HEALPixes of order
		6, plotted in galactic coordinates.  Only objects for which all of no_sc,
		comp, and clone are NULL are shown.  Note that the scale on the aux axis is
		logarithmic:

		.. image:: /\rdId/q/static/highpm_density.png

		Essentially, in areas so crowded that almost any movement of a star will
		hit another (within our match criteria), there will be many spurious
		high-PM stars, in particular when there are many spurious sources to begin
		with (plate borders).  At sufficient distance from the bulge and the
		Magellanic clouds, high proper motions in HSOY should be fairly reliable
		when filtering out the suspicious (no_sc, clone, comp) objects.

		(4) HSOY inherits the USNO photometry in B, R, and I.  It has many known
		issues (including, but not limited to, ridiculous values like 65.47
		shown in the table above).  Do not lightly use b1mag, b2mag, r1mag, r2mag,
		and imag, as many of them are invalid.

		(5) At the bright end, HSOY is severely incomplete.  This reflects
		the corresponding incompleteness of Gaia DR1.
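		
		The filtering on no_sc, clone, and comp recommended under (3) could be
		sketched in ADQL as follows.  This assumes that proper motions are stored
		in degrees per year (as the value ranges in the table above suggest), so
		that 150 mas/yr corresponds to 150/3.6e6 deg/yr, and that pmRA already
		contains the cos(Dec) factor, as in PPMXL; the sky region is an arbitrary
		example, included because the proper motion condition is not indexed::

			SELECT TOP 1000 ipix, raj2000, dej2000, pmRA, pmDE
			FROM hsoy.main
			WHERE no_sc IS NULL AND clone IS NULL AND comp IS NULL
				AND SQRT(POWER(pmRA, 2) + POWER(pmDE, 2)) > 150/3.6e6
				AND 1=CONTAINS(
					POINT('ICRS', raj2000, dej2000),
					CIRCLE('ICRS', 31.9, 61.7, 1))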

		Trivia
		------

		The primary key of HSOY is the tuple (ipix,comp).  The suggested reference
		to a HSOY object is "HSOY <ipix>.0" when comp is NULL, and "HSOY
		<ipix>.<comp>" otherwise.
		
		If you want to cite HSOY, please see `our citation advice`_.

		.. _HSOY resource descriptor: http://svn.ari.uni-heidelberg.de/svn/gavo/hdinputs/hsoy/q.rd
		.. _our citation advice: /hsoy/q/q/howtocite

	]]></meta>


	<!-- in DaCHS svn revision 6098, MD removed the "provenance", i.e.,
	the tables and the code that produced the HSOY.  See in svn or in the
	README if you want to recover it. -->

	<table id="main" onDisk="True" mixin="//scs#q3cindex"
			namePath="ppmxl/q#main" adql="True">
		<meta name="title">HSOY Object Catalog</meta>

		<index columns="phot_g_mean_mag">
			<FEED source="%#staticindex"/>
		</index>
		<index columns="Jmag">
			<FEED source="%#staticindex"/>
		</index>
		<index columns="Kmag">
			<FEED source="%#staticindex"/>
		</index>
		<index columns="gaia_id">
			<FEED source="%#staticindex"/>
		</index>
		<index columns="ipix">
			<FEED source="%#staticindex"/>
		</index>
		<index columns="raj2000">
			<FEED source="%#staticindex"/>
		</index>
		<index columns="dej2000">
			<FEED source="%#staticindex"/>
		</index>
		
		<column original="ipix"
			ucd="meta.id.cross"
			description="The PPMXL object identifier, which in turn is the
			q3c ipix of the original USNO-B object; for HSOY, only (ipix, comp)
			is a unique identifier (primary key).  The recommended identifier
			form is 'HSOY ipix.comp', where comp=0 for NULL comps.  This is
			what the SCS generates."/>
		<column name="comp" type="smallint"
			ucd="meta.code.multip"
			tablehead="Comp"
			description="If non-null this indicates that multiple Gaia
				objects matched the PPMXL object; this may indicate
				bona fide multiple stars, but more likely is due to failed
				matching of Gaia observations at different epochs.  In both
				cases, proper motions must be used with care. The index
				is artificial, i.e., no primary, secondary, etc, is implied.
				ipix+comp together are a primary key to hsoy.main."
			verbLevel="5">
			<values nullLiteral="0"/>
		</column>

		<LOOP listItems="raj2000 dej2000 e_raepRA e_deepDE pmRA pmDE
				e_pmRA e_pmDE epRA epDE Jmag e_Jmag Hmag e_Hmag Kmag e_Kmag
				b1mag b2mag r1mag r2mag imag magSurveys">
			<events>
				<column original="\item"/>
			</events>
		</LOOP>
		<column name="nobs" type="smallint"
			ucd="meta.number;obs"
			tablehead="#Obs"
			description="Number of observations contributing to this object
				(always nobs(ppmxl)+1)"
			verbLevel="25">
			<values nullLiteral="0"/>
		</column>

		<column original="gaia/q#dr1.source_id" name="gaia_id"
			ucd="meta.id"/>

		<column original="gaia/q#dr1.phot_g_mean_mag"
			description="Mean magnitude in the G band from Gaia DR1.
				Magnitudes for which err_flux/flux&gt;0.1 have been dropped."/>
		<column original="gaia/q#dr1.phot_g_mean_flux_error"
			name="e_phot_g_mean_mag"
			ucd="stat.error;phot.mag;em.opt.V"
			tablehead="Err. m_G"
			description="Estimated error in Gaia G-band magnitude.  This
				is estimated as 1.09*err_flux/flux which is good as a symmetric
				1 σ-error of the magnitude to at least within a few percent
				when err_flux/flux is smaller than 0.1, as it is for the
				HSOY objects."
			verbLevel="15"/>
		<column name="clone" type="smallint"
			ucd="meta.code.qual"
			tablehead="Clone"
			description="If 1, more than one PPMXL object matched to this Gaia object
				(i.e.: proper motion is probably wrong, any apparent duplicity is
				probably spurious). This is normally due to failed matching of
				objects from different plates in USNO-B."
			verbLevel="15" note="cl">
			<values nullLiteral="-1"/>
		</column>

		<column name="no_sc" type="smallint"
			ucd="meta.code.qual"
			tablehead="Spurious"
			description="1 if this object had no match within 3 arcseconds in
				SuperCOSMOS at J2000.  It is very likely that it is not
				a real object.  NOTE: False is encoded as NULL in the database; to
				exclude objects without a SuperCOSMOS match, write
				no_sc IS NULL."
			verbLevel="15">
			<values nullLiteral="0"/>
		</column>

		<meta name="note" tag="cl">
				During the construction of USNO-B, numerous observations
				were not matched properly.  As a result, at least 10% of USNO-B objects
				are spurious.  Most of these carried over into PPMXL.  Since many
				of these objects have completely erroneous proper motions, far fewer
				will be in HSOY.  However, they are present, and they are characterised
				by multiple PPMXL objects being matched to one Gaia object.
				Since PPMXL's resolution is much worse than Gaia's, these will almost
				always be spurious; also, since the Gaia position is fixed in
				both solutions, their paths will cross fairly near to J2015.0.
		</meta>
	</table>

	<data id="import" auto="False">
		<sources pattern="static/hsoy.dump.xz"/>
		<directGrammar id="booster"
			cBooster="res/parsedump.c" type="split" splitChar="|"
			preFilter="xzcat"/>
		<make table="main"/>
	</data>

	<service id="download" allowed="form">
		<!-- I don't want to have the endless download to be handled by DaCHS,
		as I'd hate it if I'd have to wait for many hours for one of these
		huge downloads to finish.  Thus, I let apache do the actual download;
		the redirect below assumes hsoy/static is readable for the web server
		and there's something like

		Alias /hsoy /data/gavo/inputs/hsoy/static/

		in 000-default.conf on alinlam -->

		<meta name="title">Download guard for hsoy dump</meta>
		<template key="form">res/downloadguard.html</template>
		<pythonCore>
			<inputTable>
				<inputKey type="text" name="input" multiplicity="single"
					tablehead="I read the warning"
					description="Type in 'yes' here to pull the data"/>
			</inputTable>
			<outputTable/>
			<coreProc>
				<code>
					from gavo import svcs

					if inputTable.getParam("input")=="yes":
						raise svcs.WebRedirect(
							"http://vo.ari.uni-heidelberg.de/hsoy/hsoy.dump.xz?")
					else:
						raise base.ValidationError("This must be 'yes' (without any"
							" quotes)", "input")
				</code>
			</coreProc>
		</pythonCore>
	</service>

	<service id="q" allowed="scs.xml,form,static">
		<meta name="shortName">HSOY SCS</meta>
		<publish render="form" sets="ivo_managed,local"/>
		<publish render="scs.xml" sets="ivo_managed"/>

		<property name="staticData">static</property>
		<meta>
			testQuery.ra: 31.9
			testQuery.dec: 61.7
		</meta>
		<scsCore queriedTable="main">
			<FEED source="//scs#coreDescs"/>
			<condDesc buildFrom="phot_g_mean_mag"/>
			<condDesc buildFrom="Kmag"/>
			<condDesc buildFrom="gaia_id"/>
			<condDesc buildFrom="ipix"/>
			<outputTable>
				<column name="hsoyid" type="text"
					ucd="meta.id;meta.main"
					tablehead="HSOY Id"
					description="HSOY identifier (ipix plus comp from original table)"
					verbLevel="1"
					select="'HSOY ' || ipix || (CASE WHEN comp IS NOT NULL
						THEN '.' || comp
						ELSE '.0' END)"/>
				<LOOP listItems="gaia_id clone phot_g_mean_mag e_phot_g_mean_mag
					raj2000 dej2000 e_raepRA e_deepDE pmRA pmDE
					e_pmRA e_pmDE epRA epDE Jmag e_Jmag Hmag e_Hmag Kmag e_Kmag
					b1mag b2mag r1mag r2mag imag magSurveys nobs">
					<events>
						<outputField original="\item"/>
					</events>
				</LOOP>
			</outputTable>
		</scsCore>
	</service>

	<regSuite title="HSOY regression">
		<regTest title="HSOY SCS yields plausible values">
			<url RA="276.5545" DEC="-17.8694" SR="0.002">q/scs.xml</url>
			<code>
				rows = self.getVOTableRows()
				self.assertEqual(len(rows), 1)
				self.assertEqual(rows[0]["hsoyid"], "HSOY 5055227856149121395.0")
				self.assertAlmostEqual(rows[0]["e_phot_g_mean_mag"],
					0.00177645)
			</code>
		</regTest>

		<regTest title="Catalog download works">
			<url>download/form</url>
			<code>
				self.assertHasStrings("You are about to download HSOY",
					"I read the warning&lt;/label>")
			</code>
		</regTest>

		<regTest title="Proper redirect is served">
			<url parSet="form" input="yes">download/form</url>
			<code>
				self.assertHeader("location",
					"http://vo.ari.uni-heidelberg.de/hsoy/hsoy.dump.xz?")
				self.assertHTTPStatus(301)
			</code>
		</regTest>
		
		<regTest title="HSOY dump exists" tags="bigserver">
			<url httpMethod="HEAD"
				>http://vo.ari.uni-heidelberg.de/hsoy/hsoy.dump.xz?</url>
			<code>
				self.assertHTTPStatus(200)
			</code>
		</regTest>
	</regSuite>
</resource>
