Help on function merge in module pandas.core.reshape.merge:
merge(left, right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), copy=True, indicator=False, validate=None)
Merge DataFrame objects by performing a database-style join operation by
columns or indexes.
If joining columns on columns, the DataFrame indexes *will be
ignored*. Otherwise if joining indexes on indexes or indexes on a column or
columns, the index will be passed on.
Parameters
----------
left : DataFrame
right : DataFrame
how : {'left', 'right', 'outer', 'inner'}, default 'inner'
* left: use only keys from left frame, similar to a SQL left outer join;
preserve key order
* right: use only keys from right frame, similar to a SQL right outer join;
preserve key order
* outer: use union of keys from both frames, similar to a SQL full outer
join; sort keys lexicographically
* inner: use intersection of keys from both frames, similar to a SQL inner
join; preserve the order of the left keys
on : label or list
Column or index level names to join on. These must be found in both
DataFrames. If `on` is None and not merging on indexes then this defaults
to the intersection of the columns in both DataFrames.
left_on : label or list, or array-like
Column or index level names to join on in the left DataFrame. Can also
be an array or list of arrays of the length of the left DataFrame.
These arrays are treated as if they are columns.
right_on : label or list, or array-like
Column or index level names to join on in the right DataFrame. Can also
be an array or list of arrays of the length of the right DataFrame.
These arrays are treated as if they are columns.
left_index : boolean, default False
Use the index from the left DataFrame as the join key(s). If it is a
MultiIndex, the number of keys in the other DataFrame (either the index
or a number of columns) must match the number of levels
right_index : boolean, default False
Use the index from the right DataFrame as the join key. Same caveats as
left_index
sort : boolean, default False
Sort the join keys lexicographically in the result DataFrame. If False,
the order of the join keys depends on the join type (how keyword)
suffixes : 2-length sequence (tuple, list, ...)
Suffix to apply to overlapping column names in the left and right
side, respectively
copy : boolean, default True
If False, do not copy data unnecessarily
indicator : boolean or string, default False
If True, adds a column to output DataFrame called "_merge" with
information on the source of each row.
If string, column with information on source of each row will be added to
output DataFrame, and column will be named value of string.
Information column is Categorical-type and takes on a value of "left_only"
for observations whose merge key only appears in 'left' DataFrame,
"right_only" for observations whose merge key only appears in 'right'
DataFrame, and "both" if the observation's merge key is found in both.
validate : string, default None
If specified, checks if merge is of specified type.
* "one_to_one" or "1:1": check if merge keys are unique in both
left and right datasets.
* "one_to_many" or "1:m": check if merge keys are unique in left
dataset.
* "many_to_one" or "m:1": check if merge keys are unique in right
dataset.
* "many_to_many" or "m:m": allowed, but does not result in checks.
.. versionadded:: 0.21.0
Notes
-----
Support for specifying index levels as the `on`, `left_on`, and
`right_on` parameters was added in version 0.23.0
Examples
--------
>>> A              >>> B
    lkey value         rkey value
0   foo  1         0   foo  5
1   bar  2         1   bar  6
2   baz  3         2   qux  7
3   foo  4         3   bar  8
>>> A.merge(B, left_on='lkey', right_on='rkey', how='outer')
   lkey  value_x  rkey  value_y
0  foo   1        foo   5
1  foo   4        foo   5
2  bar   2        bar   6
3  bar   2        bar   8
4  baz   3        NaN   NaN
5  NaN   NaN      qux   7
Returns
-------
merged : DataFrame
The output type will be the same as 'left', if it is a subclass
of DataFrame.
See also
--------
merge_ordered
merge_asof
DataFrame.join
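The `indicator` and `validate` parameters described above can be sketched as follows. This is a minimal illustration, not part of the original docstring; the frames `A` and `B` are rebuilt here to mirror the ones in the Examples section, and the variable names are assumptions.

```python
import pandas as pd

# Frames rebuilt to mirror A and B from the Examples section (assumed values).
A = pd.DataFrame({"lkey": ["foo", "bar", "baz", "foo"], "value": [1, 2, 3, 4]})
B = pd.DataFrame({"rkey": ["foo", "bar", "qux", "bar"], "value": [5, 6, 7, 8]})

# indicator=True adds a "_merge" column naming the source of each row.
merged = pd.merge(A, B, left_on="lkey", right_on="rkey",
                  how="outer", indicator=True)
n_both = int((merged["_merge"] == "both").sum())
n_left_only = int((merged["_merge"] == "left_only").sum())
n_right_only = int((merged["_merge"] == "right_only").sum())

# validate="one_to_one" checks key uniqueness on both sides; "foo" repeats in A
# and "bar" repeats in B, so the check is expected to fail here.
try:
    pd.merge(A, B, left_on="lkey", right_on="rkey", validate="one_to_one")
    validation_error = None
except pd.errors.MergeError as exc:
    validation_error = exc
```

Filtering on the `_merge` column is a common way to find rows that failed to match on one side.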
Help on function concat in module pandas.core.reshape.concat:
concat(objs, axis=0, join='outer', join_axes=None, ignore_index=False, keys=None, levels=None, names=None, verify_integrity=False, sort=None, copy=True)
Concatenate pandas objects along a particular axis with optional set logic
along the other axes.
Can also add a layer of hierarchical indexing on the concatenation axis,
which may be useful if the labels are the same (or overlapping) on
the passed axis number.
Parameters
----------
objs : a sequence or mapping of Series, DataFrame, or Panel objects
If a dict is passed, the sorted keys will be used as the `keys`
argument, unless it is passed, in which case the values will be
selected (see below). Any None objects will be dropped silently unless
they are all None in which case a ValueError will be raised
axis : {0/'index', 1/'columns'}, default 0
The axis to concatenate along
join : {'inner', 'outer'}, default 'outer'
How to handle indexes on other axis(es)
join_axes : list of Index objects
Specific indexes to use for the other n - 1 axes instead of performing
inner/outer set logic
ignore_index : boolean, default False
If True, do not use the index values along the concatenation axis. The
resulting axis will be labeled 0, ..., n - 1. This is useful if you are
concatenating objects where the concatenation axis does not have
meaningful indexing information. Note the index values on the other
axes are still respected in the join.
keys : sequence, default None
If multiple levels passed, should contain tuples. Construct
hierarchical index using the passed keys as the outermost level
levels : list of sequences, default None
Specific levels (unique values) to use for constructing a
MultiIndex. Otherwise they will be inferred from the keys
names : list, default None
Names for the levels in the resulting hierarchical index
verify_integrity : boolean, default False
Check whether the new concatenated axis contains duplicates. This can
be very expensive relative to the actual data concatenation
sort : boolean, default None
Sort non-concatenation axis if it is not already aligned when `join`
is 'outer'. The current default of sorting is deprecated and will
change to not-sorting in a future version of pandas.
Explicitly pass ``sort=True`` to silence the warning and sort.
Explicitly pass ``sort=False`` to silence the warning and not sort.
This has no effect when ``join='inner'``, which already preserves
the order of the non-concatenation axis.
.. versionadded:: 0.23.0
copy : boolean, default True
If False, do not copy data unnecessarily
Returns
-------
concatenated : object, type of objs
When concatenating all ``Series`` along the index (axis=0), a
``Series`` is returned. When ``objs`` contains at least one
``DataFrame``, a ``DataFrame`` is returned. When concatenating along
the columns (axis=1), a ``DataFrame`` is returned.
Notes
-----
The keys, levels, and names arguments are all optional.
A walkthrough of how this method fits in with other tools for combining
pandas objects can be found `here
<http://pandas.pydata.org/pandas-docs/stable/merging.html>`__.
See Also
--------
Series.append
DataFrame.append
DataFrame.join
DataFrame.merge
Examples
--------
Combine two ``Series``.
>>> s1 = pd.Series(['a', 'b'])
>>> s2 = pd.Series(['c', 'd'])
>>> pd.concat([s1, s2])
0 a
1 b
0 c
1 d
dtype: object
Clear the existing index and reset it in the result
by setting the ``ignore_index`` option to ``True``.
>>> pd.concat([s1, s2], ignore_index=True)
0 a
1 b
2 c
3 d
dtype: object
Add a hierarchical index at the outermost level of
the data with the ``keys`` option.
>>> pd.concat([s1, s2], keys=['s1', 's2',])
s1 0 a
1 b
s2 0 c
1 d
dtype: object
Label the index keys you create with the ``names`` option.
>>> pd.concat([s1, s2], keys=['s1', 's2'],
... names=['Series name', 'Row ID'])
Series name Row ID
s1 0 a
1 b
s2 0 c
1 d
dtype: object
Combine two ``DataFrame`` objects with identical columns.
>>> df1 = pd.DataFrame([['a', 1], ['b', 2]],
... columns=['letter', 'number'])
>>> df1
letter number
0 a 1
1 b 2
>>> df2 = pd.DataFrame([['c', 3], ['d', 4]],
... columns=['letter', 'number'])
>>> df2
letter number
0 c 3
1 d 4
>>> pd.concat([df1, df2])
letter number
0 a 1
1 b 2
0 c 3
1 d 4
Combine ``DataFrame`` objects with overlapping columns
and return everything. Columns outside the intersection will
be filled with ``NaN`` values.
>>> df3 = pd.DataFrame([['c', 3, 'cat'], ['d', 4, 'dog']],
... columns=['letter', 'number', 'animal'])
>>> df3
letter number animal
0 c 3 cat
1 d 4 dog
>>> pd.concat([df1, df3])
animal letter number
0 NaN a 1
1 NaN b 2
0 cat c 3
1 dog d 4
Combine ``DataFrame`` objects with overlapping columns
and return only those that are shared by passing ``inner`` to
the ``join`` keyword argument.
>>> pd.concat([df1, df3], join="inner")
letter number
0 a 1
1 b 2
0 c 3
1 d 4
Combine ``DataFrame`` objects horizontally along the x axis by
passing in ``axis=1``.
>>> df4 = pd.DataFrame([['bird', 'polly'], ['monkey', 'george']],
... columns=['animal', 'name'])
>>> pd.concat([df1, df4], axis=1)
letter number animal name
0 a 1 bird polly
1 b 2 monkey george
Prevent the result from including duplicate index values with the
``verify_integrity`` option.
>>> df5 = pd.DataFrame([1], index=['a'])
>>> df5
0
a 1
>>> df6 = pd.DataFrame([2], index=['a'])
>>> df6
0
a 2
>>> pd.concat([df5, df6], verify_integrity=True)
Traceback (most recent call last):
...
ValueError: Indexes have overlapping values: ['a']
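The `sort` and `ignore_index` parameters described above can be combined as in the following sketch. It is an illustrative assumption, not part of the original docstring; `df1` and `df3` are rebuilt to match the frames in the Examples section.

```python
import pandas as pd

# Frames rebuilt to match df1 and df3 from the Examples section.
df1 = pd.DataFrame([["a", 1], ["b", 2]], columns=["letter", "number"])
df3 = pd.DataFrame([["c", 3, "cat"], ["d", 4, "dog"]],
                   columns=["letter", "number", "animal"])

# sort=False keeps the columns in order of first appearance instead of
# sorting them; ignore_index=True relabels the rows 0..n-1.
combined = pd.concat([df1, df3], sort=False, ignore_index=True)

column_order = list(combined.columns)
row_labels = list(combined.index)
missing_animals = int(combined["animal"].isna().sum())
```

Rows coming from `df1` have no `animal` value, so that column is filled with NaN for them.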
Help on function date_range in module pandas.core.indexes.datetimes:
date_range(start=None, end=None, periods=None, freq=None, tz=None, normalize=False, name=None, closed=None, **kwargs)
Return a fixed frequency DatetimeIndex.
Parameters
----------
start : str or datetime-like, optional
Left bound for generating dates.
end : str or datetime-like, optional
Right bound for generating dates.
periods : integer, optional
Number of periods to generate.
freq : str or DateOffset, default 'D' (calendar daily)
Frequency strings can have multiples, e.g. '5H'. See
:ref:`here <timeseries.offset_aliases>` for a list of
frequency aliases.
tz : str or tzinfo, optional
Time zone name for returning localized DatetimeIndex, for example
'Asia/Hong_Kong'. By default, the resulting DatetimeIndex is
timezone-naive.
normalize : bool, default False
Normalize start/end dates to midnight before generating date range.
name : str, default None
Name of the resulting DatetimeIndex.
closed : {None, 'left', 'right'}, optional
Make the interval closed with respect to the given frequency to
the 'left', 'right', or both sides (None, the default).
**kwargs
For compatibility. Has no effect on the result.
Returns
-------
rng : DatetimeIndex
See Also
--------
pandas.DatetimeIndex : An immutable container for datetimes.
pandas.timedelta_range : Return a fixed frequency TimedeltaIndex.
pandas.period_range : Return a fixed frequency PeriodIndex.
pandas.interval_range : Return a fixed frequency IntervalIndex.
Notes
-----
Of the four parameters ``start``, ``end``, ``periods``, and ``freq``,
exactly three must be specified. If ``freq`` is omitted, the resulting
``DatetimeIndex`` will have ``periods`` linearly spaced elements between
``start`` and ``end`` (closed on both sides).
To learn more about the frequency strings, please see `this link
<http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`__.
Examples
--------
**Specifying the values**
The next four examples generate the same `DatetimeIndex`, but vary
the combination of `start`, `end` and `periods`.
Specify `start` and `end`, with the default daily frequency.
>>> pd.date_range(start='1/1/2018', end='1/08/2018')
DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
'2018-01-05', '2018-01-06', '2018-01-07', '2018-01-08'],
dtype='datetime64[ns]', freq='D')
Specify `start` and `periods`, the number of periods (days).
>>> pd.date_range(start='1/1/2018', periods=8)
DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
'2018-01-05', '2018-01-06', '2018-01-07', '2018-01-08'],
dtype='datetime64[ns]', freq='D')
Specify `end` and `periods`, the number of periods (days).
>>> pd.date_range(end='1/1/2018', periods=8)
DatetimeIndex(['2017-12-25', '2017-12-26', '2017-12-27', '2017-12-28',
'2017-12-29', '2017-12-30', '2017-12-31', '2018-01-01'],
dtype='datetime64[ns]', freq='D')
Specify `start`, `end`, and `periods`; the frequency is generated
automatically (linearly spaced).
>>> pd.date_range(start='2018-04-24', end='2018-04-27', periods=3)
DatetimeIndex(['2018-04-24 00:00:00', '2018-04-25 12:00:00',
'2018-04-27 00:00:00'], freq=None)
**Other Parameters**
Change the `freq` (frequency) to ``'M'`` (month end frequency).
>>> pd.date_range(start='1/1/2018', periods=5, freq='M')
DatetimeIndex(['2018-01-31', '2018-02-28', '2018-03-31', '2018-04-30',
'2018-05-31'],
dtype='datetime64[ns]', freq='M')
Multiples are allowed.
>>> pd.date_range(start='1/1/2018', periods=5, freq='3M')
DatetimeIndex(['2018-01-31', '2018-04-30', '2018-07-31', '2018-10-31',
'2019-01-31'],
dtype='datetime64[ns]', freq='3M')
`freq` can also be specified as an Offset object.
>>> pd.date_range(start='1/1/2018', periods=5, freq=pd.offsets.MonthEnd(3))
DatetimeIndex(['2018-01-31', '2018-04-30', '2018-07-31', '2018-10-31',
'2019-01-31'],
dtype='datetime64[ns]', freq='3M')
Specify `tz` to set the timezone.
>>> pd.date_range(start='1/1/2018', periods=5, tz='Asia/Tokyo')
DatetimeIndex(['2018-01-01 00:00:00+09:00', '2018-01-02 00:00:00+09:00',
'2018-01-03 00:00:00+09:00', '2018-01-04 00:00:00+09:00',
'2018-01-05 00:00:00+09:00'],
dtype='datetime64[ns, Asia/Tokyo]', freq='D')
`closed` controls whether to include `start` and `end` that are on the
boundary. The default includes boundary points on either end.
>>> pd.date_range(start='2017-01-01', end='2017-01-04', closed=None)
DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03', '2017-01-04'],
dtype='datetime64[ns]', freq='D')
Use ``closed='left'`` to exclude `end` if it falls on the boundary.
>>> pd.date_range(start='2017-01-01', end='2017-01-04', closed='left')
DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03'],
dtype='datetime64[ns]', freq='D')
Use ``closed='right'`` to exclude `start` if it falls on the boundary.
>>> pd.date_range(start='2017-01-01', end='2017-01-04', closed='right')
DatetimeIndex(['2017-01-02', '2017-01-03', '2017-01-04'],
dtype='datetime64[ns]', freq='D')
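The rule from the Notes section (specify exactly three of `start`, `end`, `periods`, and `freq`) can be checked with a short sketch. The dates and variable names below are illustrative assumptions, not part of the original docstring.

```python
import pandas as pd

# Three ways to describe the same eight daily timestamps. freq defaults to 'D'
# in each call, so each call fixes exactly three of the four parameters.
by_start_end = pd.date_range(start="2018-01-01", end="2018-01-08")
by_start_periods = pd.date_range(start="2018-01-01", periods=8)
by_end_periods = pd.date_range(end="2018-01-08", periods=8)

# With start, end and periods given and freq omitted, the points are
# linearly spaced between the two bounds.
spaced = pd.date_range(start="2018-04-24", end="2018-04-27", periods=3)
midpoint = spaced[1]
```

The midpoint of the linearly spaced range falls at noon on 2018-04-25, halfway between the two bounds.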
Help on class Timedelta in module pandas._libs.tslibs.timedeltas:
class Timedelta(_Timedelta)
| Timedelta(value=<object object at 0x000002BFBD410440>, unit=None, **kwargs)
|
| Represents a duration, the difference between two dates or times.
|
| Timedelta is the pandas equivalent of python's ``datetime.timedelta``
| and is interchangeable with it in most cases.
|
| Parameters
| ----------
| value : Timedelta, timedelta, np.timedelta64, string, or integer
| unit : string, {'ns', 'us', 'ms', 's', 'm', 'h', 'D'}, optional
| Denote the unit of the input, if input is an integer. Default 'ns'.
| days, seconds, microseconds,
| milliseconds, minutes, hours, weeks : numeric, optional
| Values for construction in compat with datetime.timedelta.
| np ints and floats will be coerced to python ints and floats.
|
| Notes
| -----
| The ``.value`` attribute is always in ns.
|
| Method resolution order:
| Timedelta
| _Timedelta
| datetime.timedelta
| builtins.object
|
| Methods defined here:
|
| __abs__(self)
|
| __add__(self, other)
|
| __divmod__(self, other)
|
| __floordiv__(self, other)
|
| __inv__(self)
|
| __mod__(self, other)
|
| __mul__(self, other)
|
| __neg__(self)
|
| __new__(cls, value=<object object at 0x000002BFBD410440>, unit=None, **kwargs)
|
| __pos__(self)
|
| __radd__(self, other)
|
| __rdivmod__(self, other)
|
| __reduce__(self)
|
| __rfloordiv__(self, other)
|
| __rmod__(self, other)
|
| __rmul__ = __mul__(self, other)
|
| __rsub__(self, other)
|
| __rtruediv__(self, other)
|
| __setstate__(self, state)
|
| __sub__(self, other)
|
| __truediv__(self, other)
|
| ceil(self, freq)
| return a new Timedelta ceiled to this resolution
|
| Parameters
| ----------
| freq : a freq string indicating the ceiling resolution
|
| floor(self, freq)
| return a new Timedelta floored to this resolution
|
| Parameters
| ----------
| freq : a freq string indicating the flooring resolution
|
| round(self, freq)
| Round the Timedelta to the specified resolution
|
| Returns
| -------
| a new Timedelta rounded to the given resolution of `freq`
|
| Parameters
| ----------
| freq : a freq string indicating the rounding resolution
|
| Raises
| ------
| ValueError if the freq cannot be converted
|
| ----------------------------------------------------------------------
| Data descriptors defined here:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
|
| ----------------------------------------------------------------------
| Data and other attributes defined here:
|
| max = Timedelta('106751 days 23:47:16.854775')
|
| min = Timedelta('-106752 days +00:12:43.145224')
|
| ----------------------------------------------------------------------
| Methods inherited from _Timedelta:
|
| __bool__(self, /)
| self != 0
|
| __eq__(self, value, /)
| Return self==value.
|
| __ge__(self, value, /)
| Return self>=value.
|
| __gt__(self, value, /)
| Return self>value.
|
| __hash__(self, /)
| Return hash(self).
|
| __le__(self, value, /)
| Return self<=value.
|
| __lt__(self, value, /)
| Return self<value.
|
| __ne__(self, value, /)
| Return self!=value.
|
| __reduce_cython__(...)
|
| __repr__(self, /)
| Return repr(self).
|
| __setstate_cython__(...)
|
| __str__(self, /)
| Return str(self).
|
| isoformat(...)
| Format Timedelta as ISO 8601 Duration like
| ``P[n]Y[n]M[n]DT[n]H[n]M[n]S``, where the ``[n]`` s are replaced by the
| values. See https://en.wikipedia.org/wiki/ISO_8601#Durations
|
| .. versionadded:: 0.20.0
|
| Returns
| -------
| formatted : str
|
| Notes
| -----
| The longest component is days, whose value may be larger than
| 365.
| Every component is always included, even if its value is 0.
| Pandas uses nanosecond precision, so up to 9 decimal places may
| be included in the seconds component.
| Trailing 0's are removed from the seconds component after the decimal.
| We do not 0 pad components, so it's `...T5H...`, not `...T05H...`
|
| Examples
| --------
| >>> td = pd.Timedelta(days=6, minutes=50, seconds=3,
| ... milliseconds=10, microseconds=10, nanoseconds=12)
| >>> td.isoformat()
| 'P6DT0H50M3.010010012S'
| >>> pd.Timedelta(hours=1, seconds=10).isoformat()
| 'P0DT1H0M10S'
| >>> pd.Timedelta(days=500.5).isoformat()
| 'P500DT12H0M0S'
|
| See Also
| --------
| Timestamp.isoformat
|
| to_pytimedelta(...)
| return an actual datetime.timedelta object
| note: we lose nanosecond resolution if any
|
| to_timedelta64(...)
| Returns a numpy.timedelta64 object with 'ns' precision
|
| total_seconds(...)
| Total duration of timedelta in seconds (to ns precision)
|
| view(...)
| array view compat
|
| ----------------------------------------------------------------------
| Data descriptors inherited from _Timedelta:
|
| asm8
| return a numpy timedelta64 array view of myself
|
| components
| Return a Components NamedTuple-like
|
| delta
| Return the timedelta in nanoseconds (ns), for internal compatibility.
|
| Returns
| -------
| int
| Timedelta in nanoseconds.
|
| Examples
| --------
| >>> td = pd.Timedelta('1 days 42 ns')
| >>> td.delta
| 86400000000042
|
| >>> td = pd.Timedelta('3 s')
| >>> td.delta
| 3000000000
|
| >>> td = pd.Timedelta('3 ms 5 us')
| >>> td.delta
| 3005000
|
| >>> td = pd.Timedelta(42, unit='ns')
| >>> td.delta
| 42
|
| freq
|
| is_populated
|
| nanoseconds
| Return the number of nanoseconds (n), where 0 <= n < 1 microsecond.
|
| Returns
| -------
| int
| Number of nanoseconds.
|
| See Also
| --------
| Timedelta.components : Return all attributes with assigned values
| (i.e. days, hours, minutes, seconds, milliseconds, microseconds,
| nanoseconds).
|
| Examples
| --------
| **Using string input**
|
| >>> td = pd.Timedelta('1 days 2 min 3 us 42 ns')
| >>> td.nanoseconds
| 42
|
| **Using integer input**
|
| >>> td = pd.Timedelta(42, unit='ns')
| >>> td.nanoseconds
| 42
|
| resolution
| return a string representing the lowest resolution that we have
|
| value
|
| ----------------------------------------------------------------------
| Data and other attributes inherited from _Timedelta:
|
| __array_priority__ = 100
|
| __pyx_vtable__ = <capsule object NULL>
|
| ----------------------------------------------------------------------
| Methods inherited from datetime.timedelta:
|
| __getattribute__(self, name, /)
| Return getattr(self, name).
|
| ----------------------------------------------------------------------
| Data descriptors inherited from datetime.timedelta:
|
| days
| Number of days.
|
| microseconds
| Number of microseconds (>= 0 and less than 1 second).
|
| seconds
| Number of seconds (>= 0 and less than 1 day).
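The `floor`, `ceil`, `total_seconds` and `to_pytimedelta` methods listed above can be tried with a short sketch. The duration value and the choice of `'min'` as the frequency string are assumptions made for illustration.

```python
import datetime

import pandas as pd

td = pd.Timedelta("1 days 02:15:30")

# Total duration expressed in seconds: 86400 + 7200 + 900 + 30.
total = td.total_seconds()

# floor and ceil snap the duration down or up to a resolution, here minutes.
floored = td.floor("min")
ceiled = td.ceil("min")

# Convert to the standard-library type; nanosecond resolution would be lost,
# but this value has none to lose.
as_py = td.to_pytimedelta()
```

`round` takes the same kind of frequency string and raises `ValueError` if the string cannot be converted, as noted in its entry above.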
pd.Timedelta('2 days 2 hours 15 minutes 30 seconds')
Timedelta('2 days 02:15:30')
pd.Timedelta(6, unit='h')
Timedelta('0 days 06:00:00')
pd.Timedelta(days=2)
Timedelta('2 days 00:00:00')
to_timedelta()
pd.to_timedelta(['2 days 2 hours 15 minutes 30 seconds', '2 days 2 hours 15 minutes 30 seconds'])
TimedeltaIndex(['2 days 02:15:30', '2 days 02:15:30'], dtype='timedelta64[ns]', freq=None)
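Beyond lists of strings, `pd.to_timedelta` also accepts numbers with a `unit`, and an `errors` argument for input that cannot be parsed. The following is a minimal sketch with made-up input values.

```python
import pandas as pd

# Strings in mixed notations are parsed element by element.
from_strings = pd.to_timedelta(["1 days 06:00:00", "90 min"])

# Plain numbers need a unit to be meaningful; here each number is a day count.
from_numbers = pd.to_timedelta([1, 2, 3], unit="D")

# errors="coerce" turns unparseable entries into NaT instead of raising.
coerced = pd.to_timedelta(["2 days", "not a duration"], errors="coerce")
```

The result of each call is a `TimedeltaIndex`, like the output shown above.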
s = pd.Series(pd.date_range('2012-1-1', periods=3, freq='D'))
td = pd.Series([pd.Timedelta(days=i) for i in range(3)])
df = pd.DataFrame(dict(A = s, B = td))
df
           A      B
0 2012-01-01 0 days
1 2012-01-02 1 days
2 2012-01-03 2 days
df['C'] = df['A'] + df['B']
df
           A      B          C
0 2012-01-01 0 days 2012-01-01
1 2012-01-02 1 days 2012-01-03
2 2012-01-03 2 days 2012-01-05
df['D'] = df['C'] - df['B']
df
           A      B          C          D
0 2012-01-01 0 days 2012-01-01 2012-01-01
1 2012-01-02 1 days 2012-01-03 2012-01-02
2 2012-01-03 2 days 2012-01-05 2012-01-03
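The cells above add and subtract a timedelta column. The reverse direction, subtracting one datetime column from another, yields a timedelta column. This is a small sketch that rebuilds the same frame; the name `elapsed` is an assumption made for illustration.

```python
import pandas as pd

s = pd.Series(pd.date_range("2012-01-01", periods=3, freq="D"))
td = pd.Series([pd.Timedelta(days=i) for i in range(3)])
df = pd.DataFrame({"A": s, "B": td})

# datetime + timedelta gives a datetime column.
df["C"] = df["A"] + df["B"]

# datetime - datetime gives a timedelta column, recovering B.
elapsed = df["C"] - df["A"]

c_kind = df["C"].dtype.kind        # "M" marks a datetime64 dtype
elapsed_kind = elapsed.dtype.kind  # "m" marks a timedelta64 dtype
```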
Categorical Data
category
s = pd.Series(["a", 'b', 'c', 'a'], dtype='category')
s
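A Series with the `category` dtype exposes its categories and integer codes through the `.cat` accessor. The sketch below rebuilds `s` from the cell above and adds an ordered example; the `sizes` data is hypothetical.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

s = pd.Series(["a", "b", "c", "a"], dtype="category")

# The distinct labels, and the integer position of each value among them.
categories = list(s.cat.categories)
codes = list(s.cat.codes)

# An ordered categorical sorts by category order, not alphabetically.
size_dtype = CategoricalDtype(["small", "medium", "large"], ordered=True)
sizes = pd.Series(["small", "large", "medium", "small"]).astype(size_dtype)
sorted_sizes = sizes.sort_values().tolist()
smallest = sizes.min()
```

Alphabetical order would put "large" first, but the declared category order makes "small" the minimum.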
Help on class Categorical in module pandas.core.arrays.categorical:
class Categorical(pandas.core.arrays.base.ExtensionArray, pandas.core.base.PandasObject)
| Categorical(values, categories=None, ordered=None, dtype=None, fastpath=False)
|
| Represents a categorical variable in classic R / S-plus fashion
|
| `Categoricals` can take on only a limited, and usually fixed, number
| of possible values (`categories`). In contrast to statistical categorical
| variables, a `Categorical` might have an order, but numerical operations
| (additions, divisions, ...) are not possible.
|
| All values of the `Categorical` are either in `categories` or `np.nan`.
| Assigning values outside of `categories` will raise a `ValueError`. Order
| is defined by the order of the `categories`, not lexical order of the
| values.
|
| Parameters
| ----------
| values : list-like
| The values of the categorical. If categories are given, values not in
| categories will be replaced with NaN.
| categories : Index-like (unique), optional
| The unique categories for this categorical. If not given, the
| categories are assumed to be the unique values of values.
| ordered : boolean, (default False)
| Whether or not this categorical is treated as an ordered categorical.
| If not given, the resulting categorical will not be ordered.
| dtype : CategoricalDtype
| An instance of ``CategoricalDtype`` to use for this categorical
|
| .. versionadded:: 0.21.0
|
| Attributes
| ----------
| categories : Index
| The categories of this categorical
| codes : ndarray
| The codes (integer positions, which point to the categories) of this
| categorical, read only.
| ordered : boolean
| Whether or not this Categorical is ordered.
| dtype : CategoricalDtype
| The instance of ``CategoricalDtype`` storing the ``categories``
| and ``ordered``.
|
| .. versionadded:: 0.21.0
|
| Methods
| -------
| from_codes
| __array__
|
| Raises
| ------
| ValueError
| If the categories do not validate.
| TypeError
| If an explicit ``ordered=True`` is given but no `categories` and the
| `values` are not sortable.
|
| Examples
| --------
| >>> pd.Categorical([1, 2, 3, 1, 2, 3])
| [1, 2, 3, 1, 2, 3]
| Categories (3, int64): [1, 2, 3]
|
| >>> pd.Categorical(['a', 'b', 'c', 'a', 'b', 'c'])
| [a, b, c, a, b, c]
| Categories (3, object): [a, b, c]
|
| Ordered `Categoricals` can be sorted according to the custom order
| of the categories and can have a min and max value.
|
| >>> c = pd.Categorical(['a','b','c','a','b','c'], ordered=True,
| ... categories=['c', 'b', 'a'])
| >>> c
| [a, b, c, a, b, c]
| Categories (3, object): [c < b < a]
| >>> c.min()
| 'c'
|
| Notes
| -----
| See the `user guide
| <http://pandas.pydata.org/pandas-docs/stable/categorical.html>`_ for more.
|
| See also
| --------
| pandas.api.types.CategoricalDtype : Type for categorical data
| CategoricalIndex : An Index with an underlying ``Categorical``
|
| Method resolution order:
| Categorical
| pandas.core.arrays.base.ExtensionArray
| pandas.core.base.PandasObject
| pandas.core.base.StringMixin
| pandas.core.accessor.DirNamesMixin
| builtins.object
|
| Methods defined here:
|
| __array__(self, dtype=None)
| The numpy array interface.
|
| Returns
| -------
| values : numpy array
| A numpy array of either the specified dtype or,
| if dtype==None (default), the same dtype as
| categorical.categories.dtype
|
| __eq__(self, other)
|
| __ge__(self, other)
|
| __getitem__(self, key)
| Return an item.
|
| __gt__(self, other)
|
| __init__(self, values, categories=None, ordered=None, dtype=None, fastpath=False)
| Initialize self. See help(type(self)) for accurate signature.
|
| __iter__(self)
| Returns an Iterator over the values of this Categorical.
|
| __le__(self, other)
|
| __len__(self)
| The length of this Categorical.
|
| __lt__(self, other)
|
| __ne__(self, other)
|
| __setitem__(self, key, value)
| Item assignment.
|
|
| Raises
| ------
| ValueError
| If (one or more) Value is not in categories or if an assigned
| `Categorical` does not have the same categories
|
| __setstate__(self, state)
| Necessary for making this object picklable
|
| __unicode__(self)
| Unicode representation.
|
| add_categories(self, new_categories, inplace=False)
| Add new categories.
|
| `new_categories` will be included at the last/highest place in the
| categories and will be unused directly after this call.
|
| Raises
| ------
| ValueError
| If the new categories include old categories or do not validate as
| categories
|
| Parameters
| ----------
| new_categories : category or list-like of category
| The new categories to be included.
| inplace : boolean (default: False)
| Whether or not to add the categories inplace or return a copy of
| this categorical with added categories.
|
| Returns
| -------
| cat : Categorical with new categories added or None if inplace.
|
| See also
| --------
| rename_categories
| reorder_categories
| remove_categories
| remove_unused_categories
| set_categories
|
| argsort(self, *args, **kwargs)
| Return the indices that would sort the Categorical.
|
| Parameters
| ----------
| ascending : bool, default True
| Whether the indices should result in an ascending
| or descending sort.
| kind : {'quicksort', 'mergesort', 'heapsort'}, optional
| Sorting algorithm.
| *args, **kwargs:
| passed through to :func:`numpy.argsort`.
|
| Returns
| -------
| argsorted : numpy array
|
| See also
| --------
| numpy.ndarray.argsort
|
| Notes
| -----
| While an ordering is applied to the category values, arg-sorting
| in this context refers more to organizing and grouping together
| based on matching category values. Thus, this function can be
| called on an unordered Categorical instance unlike the functions
| 'Categorical.min' and 'Categorical.max'.
|
| Examples
| --------
| >>> pd.Categorical(['b', 'b', 'a', 'c']).argsort()
| array([2, 0, 1, 3])
|
| >>> cat = pd.Categorical(['b', 'b', 'a', 'c'],
| ... categories=['c', 'b', 'a'],
| ... ordered=True)
| >>> cat.argsort()
| array([3, 0, 1, 2])
|
| as_ordered(self, inplace=False)
| Sets the Categorical to be ordered
|
| Parameters
| ----------
| inplace : boolean (default: False)
| Whether or not to set the ordered attribute inplace or return a copy
| of this categorical with ordered set to True
|
| as_unordered(self, inplace=False)
| Sets the Categorical to be unordered
|
| Parameters
| ----------
| inplace : boolean (default: False)
| Whether or not to set the ordered attribute inplace or return a copy
| of this categorical with ordered set to False
|
| astype(self, dtype, copy=True)
| Coerce this type to another dtype
|
| Parameters
| ----------
| dtype : numpy dtype or pandas type
| copy : bool, default True
| By default, astype always returns a newly allocated object.
| If copy is set to False and dtype is categorical, the original
| object is returned.
|
| .. versionadded:: 0.19.0
|
| check_for_ordered(self, op)
| assert that we are ordered
|
| copy(self)
| Copy constructor.
|
| describe(self)
| Describes this Categorical
|
| Returns
| -------
| description: `DataFrame`
| A dataframe with frequency and counts by category.
|
| dropna(self)
| Return the Categorical without null values.
|
| Missing values (-1 in .codes) are detected.
|
| Returns
| -------
| valid : Categorical
|
| equals(self, other)
| Returns True if categorical arrays are equal.
|
| Parameters
| ----------
| other : `Categorical`
|
| Returns
| -------
| are_equal : boolean
|
| fillna(self, value=None, method=None, limit=None)
| Fill NA/NaN values using the specified method.
|
| Parameters
| ----------
| value : scalar, dict, Series
| If a scalar value is passed it is used to fill all missing values.
| Alternatively, a Series or dict can be used to fill in different
| values for each index. The value should not be a list. The
| value(s) passed should either be in the categories or should be
| NaN.
| method : {'backfill', 'bfill', 'pad', 'ffill', None}, default None
| Method to use for filling holes in reindexed Series
| pad / ffill: propagate last valid observation forward to next valid
| backfill / bfill: use NEXT valid observation to fill gap
| limit : int, default None
| (Not implemented yet for Categorical!)
| If method is specified, this is the maximum number of consecutive
| NaN values to forward/backward fill. In other words, if there is
| a gap with more than this number of consecutive NaNs, it will only
| be partially filled. If method is not specified, this is the
| maximum number of entries along the entire axis where NaNs will be
| filled.
|
| Returns
| -------
| filled : Categorical with NA/NaN filled
|
| get_values(self)
| Return the values.
|
| For internal compatibility with pandas formatting.
|
| Returns
| -------
| values : numpy array
| A numpy array of the same dtype as categorical.categories.dtype or
| Index if datetime / periods
|
| is_dtype_equal(self, other)
| Returns True if categoricals are the same dtype
| same categories, and same ordered
|
| Parameters
| ----------
| other : Categorical
|
| Returns
| -------
| are_equal : boolean
|
| isin(self, values)
| Check whether `values` are contained in Categorical.
|
| Return a boolean NumPy Array showing whether each element in
| the Categorical matches an element in the passed sequence of
| `values` exactly.
|
| Parameters
| ----------
| values : set or list-like
| The sequence of values to test. Passing in a single string will
| raise a ``TypeError``. Instead, turn a single string into a
| list of one element.
|
| Returns
| -------
| isin : numpy.ndarray (bool dtype)
|
| Raises
| ------
| TypeError
| * If `values` is not a set or list-like
|
| See Also
| --------
| pandas.Series.isin : equivalent method on Series
|
| Examples
| --------
|
| >>> s = pd.Categorical(['lama', 'cow', 'lama', 'beetle', 'lama',
| ... 'hippo'])
| >>> s.isin(['cow', 'lama'])
| array([ True, True, True, False, True, False])
|
| Passing a single string as ``s.isin('lama')`` will raise an error. Use
| a list of one element instead:
|
| >>> s.isin(['lama'])
| array([ True, False, True, False, True, False])
|
| isna(self)
| Detect missing values
|
| Missing values (-1 in .codes) are detected.
|
| Returns
| -------
| a boolean array of whether my values are null
|
| See also
| --------
| isna : top-level isna
| isnull : alias of isna
| Categorical.notna : boolean inverse of Categorical.isna
|
| isnull = isna(self)
|
| map(self, mapper)
| Map categories using input correspondence (dict, Series, or function).
|
| Maps the categories to new categories. If the mapping correspondence is
| one-to-one the result is a :class:`~pandas.Categorical` which has the
| same order property as the original, otherwise a :class:`~pandas.Index`
| is returned.
|
| If a `dict` or :class:`~pandas.Series` is used any unmapped category is
| mapped to `NaN`. Note that if this happens an :class:`~pandas.Index`
| will be returned.
|
| Parameters
| ----------
| mapper : function, dict, or Series
| Mapping correspondence.
|
| Returns
| -------
| pandas.Categorical or pandas.Index
| Mapped categorical.
|
| See Also
| --------
| CategoricalIndex.map : Apply a mapping correspondence on a
| :class:`~pandas.CategoricalIndex`.
| Index.map : Apply a mapping correspondence on an
| :class:`~pandas.Index`.
| Series.map : Apply a mapping correspondence on a
| :class:`~pandas.Series`.
| Series.apply : Apply more complex functions on a
| :class:`~pandas.Series`.
|
| Examples
| --------
| >>> cat = pd.Categorical(['a', 'b', 'c'])
| >>> cat
| [a, b, c]
| Categories (3, object): [a, b, c]
| >>> cat.map(lambda x: x.upper())
| [A, B, C]
| Categories (3, object): [A, B, C]
| >>> cat.map({'a': 'first', 'b': 'second', 'c': 'third'})
| [first, second, third]
| Categories (3, object): [first, second, third]
|
| If the mapping is one-to-one the ordering of the categories is
| preserved:
|
| >>> cat = pd.Categorical(['a', 'b', 'c'], ordered=True)
| >>> cat
| [a, b, c]
| Categories (3, object): [a < b < c]
| >>> cat.map({'a': 3, 'b': 2, 'c': 1})
| [3, 2, 1]
| Categories (3, int64): [3 < 2 < 1]
|
| If the mapping is not one-to-one an :class:`~pandas.Index` is returned:
|
| >>> cat.map({'a': 'first', 'b': 'second', 'c': 'first'})
| Index(['first', 'second', 'first'], dtype='object')
|
| If a `dict` is used, all unmapped categories are mapped to `NaN` and
| the result is an :class:`~pandas.Index`:
|
| >>> cat.map({'a': 'first', 'b': 'second'})
| Index(['first', 'second', nan], dtype='object')
|
| max(self, numeric_only=None, **kwargs)
| The maximum value of the object.
|
| Only ordered `Categoricals` have a maximum!
|
| Raises
| ------
| TypeError
| If the `Categorical` is not `ordered`.
|
| Returns
| -------
| max : the maximum of this `Categorical`
|
| memory_usage(self, deep=False)
| Memory usage of my values
|
| Parameters
| ----------
| deep : bool
| Introspect the data deeply, interrogate
| `object` dtypes for system-level memory consumption
|
| Returns
| -------
| bytes used
|
| Notes
| -----
| Memory usage does not include memory consumed by elements that
| are not components of the array if deep=False
|
| See Also
| --------
| numpy.ndarray.nbytes
|
| min(self, numeric_only=None, **kwargs)
| The minimum value of the object.
|
| Only ordered `Categoricals` have a minimum!
|
| Raises
| ------
| TypeError
| If the `Categorical` is not `ordered`.
|
| Returns
| -------
| min : the minimum of this `Categorical`
|
| mode(self)
| Returns the mode(s) of the Categorical.
|
| Always returns `Categorical` even if only one value.
|
| Returns
| -------
| modes : `Categorical` (sorted)
|
| notna(self)
| Inverse of isna
|
| Both missing values (-1 in .codes) and NA as a category are detected as
| null.
|
| Returns
| -------
| a boolean array of whether my values are not null
|
| See also
| --------
| notna : top-level notna
| notnull : alias of notna
| Categorical.isna : boolean inverse of Categorical.notna
|
| notnull = notna(self)
|
| put(self, *args, **kwargs)
| Replace specific elements in the Categorical with given values.
|
| ravel(self, order='C')
| Return a flattened (numpy) array.
|
| For internal compatibility with numpy arrays.
|
| Returns
| -------
| raveled : numpy array
|
| remove_categories(self, removals, inplace=False)
| Removes the specified categories.
|
| `removals` must be included in the old categories. Values which were in
| the removed categories will be set to NaN
|
| Raises
| ------
| ValueError
| If the removals are not contained in the categories
|
| Parameters
| ----------
| removals : category or list of categories
| The categories which should be removed.
| inplace : boolean (default: False)
| Whether or not to remove the categories inplace or return a copy of
| this categorical with removed categories.
|
| Returns
| -------
| cat : Categorical with removed categories or None if inplace.
|
| See also
| --------
| rename_categories
| reorder_categories
| add_categories
| remove_unused_categories
| set_categories
|
| remove_unused_categories(self, inplace=False)
| Removes categories which are not used.
|
| Parameters
| ----------
| inplace : boolean (default: False)
| Whether or not to drop unused categories inplace or return a copy of
| this categorical with unused categories dropped.
|
| Returns
| -------
| cat : Categorical with unused categories dropped or None if inplace.
|
| See also
| --------
| rename_categories
| reorder_categories
| add_categories
| remove_categories
| set_categories
|
| rename_categories(self, new_categories, inplace=False)
| Renames categories.
|
| Raises
| ------
| ValueError
| If new categories are list-like and do not have the same number of
| items as the current categories, or do not validate as categories
|
| Parameters
| ----------
| new_categories : list-like, dict-like or callable
|
| * list-like: all items must be unique and the number of items in
| the new categories must match the existing number of categories.
|
| * dict-like: specifies a mapping from
| old categories to new. Categories not contained in the mapping
| are passed through and extra categories in the mapping are
| ignored.
|
| .. versionadded:: 0.21.0
|
| * callable : a callable that is called on all items in the old
| categories and whose return values comprise the new categories.
|
| .. versionadded:: 0.23.0
|
| .. warning::
|
| Currently, Series are considered list like. In a future version
| of pandas they'll be considered dict-like.
|
| inplace : boolean (default: False)
| Whether or not to rename the categories inplace or return a copy of
| this categorical with renamed categories.
|
| Returns
| -------
| cat : Categorical or None
| With ``inplace=False``, the new categorical is returned.
| With ``inplace=True``, there is no return value.
|
| See also
| --------
| reorder_categories
| add_categories
| remove_categories
| remove_unused_categories
| set_categories
|
| Examples
| --------
| >>> c = Categorical(['a', 'a', 'b'])
| >>> c.rename_categories([0, 1])
| [0, 0, 1]
| Categories (2, int64): [0, 1]
|
| For dict-like ``new_categories``, extra keys are ignored and
| categories not in the dictionary are passed through
|
| >>> c.rename_categories({'a': 'A', 'c': 'C'})
| [A, A, b]
| Categories (2, object): [A, b]
|
| You may also provide a callable to create the new categories
|
| >>> c.rename_categories(lambda x: x.upper())
| [A, A, B]
| Categories (2, object): [A, B]
|
| reorder_categories(self, new_categories, ordered=None, inplace=False)
| Reorders categories as specified in new_categories.
|
| `new_categories` need to include all old categories and no new category
| items.
|
| Raises
| ------
| ValueError
| If the new categories do not contain all old category items or any
| new ones
|
| Parameters
| ----------
| new_categories : Index-like
| The categories in new order.
| ordered : boolean, optional
| Whether or not the categorical is treated as an ordered categorical.
| If not given, do not change the ordered information.
| inplace : boolean (default: False)
| Whether or not to reorder the categories inplace or return a copy of
| this categorical with reordered categories.
|
| Returns
| -------
| cat : Categorical with reordered categories or None if inplace.
|
| See also
| --------
| rename_categories
| add_categories
| remove_categories
| remove_unused_categories
| set_categories
|
| repeat(self, repeats, *args, **kwargs)
| Repeat elements of a Categorical.
|
| See also
| --------
| numpy.ndarray.repeat
|
| searchsorted(self, value, side='left', sorter=None)
| Find indices where elements should be inserted to maintain order.
|
| Find the indices into a sorted Categorical `self` such that, if the
| corresponding elements in `value` were inserted before the indices,
| the order of `self` would be preserved.
|
| Parameters
| ----------
| value : array_like
| Values to insert into `self`.
| side : {'left', 'right'}, optional
| If 'left', the index of the first suitable location found is given.
| If 'right', return the last such index. If there is no suitable
| index, return either 0 or N (where N is the length of `self`).
| sorter : 1-D array_like, optional
| Optional array of integer indices that sort `self` into ascending
| order. They are typically the result of ``np.argsort``.
|
| Returns
| -------
| indices : array of ints
| Array of insertion points with the same shape as `value`.
|
| See Also
| --------
| numpy.searchsorted
|
| Notes
| -----
| Binary search is used to find the required insertion points.
|
| Examples
| --------
|
| >>> x = pd.Series([1, 2, 3])
| >>> x
| 0 1
| 1 2
| 2 3
| dtype: int64
|
| >>> x.searchsorted(4)
| array([3])
|
| >>> x.searchsorted([0, 4])
| array([0, 3])
|
| >>> x.searchsorted([1, 3], side='left')
| array([0, 2])
|
| >>> x.searchsorted([1, 3], side='right')
| array([1, 3])
|
| >>> x = pd.Categorical(['apple', 'bread', 'bread',
| ... 'cheese', 'milk'], ordered=True)
| >>> x
| [apple, bread, bread, cheese, milk]
| Categories (4, object): [apple < bread < cheese < milk]
|
| >>> x.searchsorted('bread')
| array([1]) # Note: an array, not a scalar
|
| >>> x.searchsorted(['bread'], side='right')
| array([3])
|
| set_categories(self, new_categories, ordered=None, rename=False, inplace=False)
| Sets the categories to the specified new_categories.
|
| `new_categories` can include new categories (which will result in
| unused categories) or remove old categories (which results in values
| set to NaN). If `rename==True`, the categories will simply be renamed
| (less or more items than in old categories will result in values set to
| NaN or in unused categories respectively).
|
| This method can be used to perform more than one action of adding,
| removing, and reordering simultaneously and is therefore faster than
| performing the individual steps via the more specialised methods.
|
| On the other hand, this method does not do checks (e.g., whether the
| old categories are included in the new categories on a reorder), which
| can result in surprising changes, for example when using special string
| dtypes on Python 3, which does not consider an S1 string equal to a
| single-character Python string.
|
| Raises
| ------
| ValueError
| If new_categories does not validate as categories
|
| Parameters
| ----------
| new_categories : Index-like
| The categories in new order.
| ordered : boolean, (default: False)
| Whether or not the categorical is treated as an ordered categorical.
| If not given, do not change the ordered information.
| rename : boolean (default: False)
| Whether or not the new_categories should be considered as a rename
| of the old categories or as reordered categories.
| inplace : boolean (default: False)
| Whether or not to reorder the categories inplace or return a copy of
| this categorical with reordered categories.
|
| Returns
| -------
| cat : Categorical with reordered categories or None if inplace.
|
| See also
| --------
| rename_categories
| reorder_categories
| add_categories
| remove_categories
| remove_unused_categories
|
| set_ordered(self, value, inplace=False)
| Sets the ordered attribute to the boolean value
|
| Parameters
| ----------
| value : boolean to set whether this categorical is ordered (True) or
| not (False)
| inplace : boolean (default: False)
| Whether or not to set the ordered attribute inplace or return a copy
| of this categorical with ordered set to the value
|
| shift(self, periods)
| Shift Categorical by desired number of periods.
|
| Parameters
| ----------
| periods : int
| Number of periods to move, can be positive or negative
|
| Returns
| -------
| shifted : Categorical
|
| sort_values(self, inplace=False, ascending=True, na_position='last')
| Sorts the Categorical by category value returning a new
| Categorical by default.
|
| While an ordering is applied to the category values, sorting in this
| context refers more to organizing and grouping together based on
| matching category values. Thus, this function can be called on an
| unordered Categorical instance unlike the functions 'Categorical.min'
| and 'Categorical.max'.
|
| Parameters
| ----------
| inplace : boolean, default False
| Do operation in place.
| ascending : boolean, default True
| Order ascending. Passing False orders descending. The
| ordering parameter provides the method by which the
| category values are organized.
| na_position : {'first', 'last'} (optional, default='last')
| 'first' puts NaNs at the beginning
| 'last' puts NaNs at the end
|
| Returns
| -------
| y : Categorical or None
|
| See Also
| --------
| Categorical.sort
| Series.sort_values
|
| Examples
| --------
| >>> c = pd.Categorical([1, 2, 2, 1, 5])
| >>> c
| [1, 2, 2, 1, 5]
| Categories (3, int64): [1, 2, 5]
| >>> c.sort_values()
| [1, 1, 2, 2, 5]
| Categories (3, int64): [1, 2, 5]
| >>> c.sort_values(ascending=False)
| [5, 2, 2, 1, 1]
| Categories (3, int64): [1, 2, 5]
|
| Inplace sorting can be done as well:
|
| >>> c.sort_values(inplace=True)
| >>> c
| [1, 1, 2, 2, 5]
| Categories (3, int64): [1, 2, 5]
| >>>
| >>> c = pd.Categorical([1, 2, 2, 1, 5])
|
| 'sort_values' behaviour with NaNs. Note that 'na_position'
| is independent of the 'ascending' parameter:
|
| >>> c = pd.Categorical([np.nan, 2, 2, np.nan, 5])
| >>> c
| [NaN, 2.0, 2.0, NaN, 5.0]
| Categories (2, int64): [2, 5]
| >>> c.sort_values()
| [2.0, 2.0, 5.0, NaN, NaN]
| Categories (2, int64): [2, 5]
| >>> c.sort_values(ascending=False)
| [5.0, 2.0, 2.0, NaN, NaN]
| Categories (2, int64): [2, 5]
| >>> c.sort_values(na_position='first')
| [NaN, NaN, 2.0, 2.0, 5.0]
| Categories (2, int64): [2, 5]
| >>> c.sort_values(ascending=False, na_position='first')
| [NaN, NaN, 5.0, 2.0, 2.0]
| Categories (2, int64): [2, 5]
|
| take = take_nd(self, indexer, allow_fill=None, fill_value=None)
|
| take_nd(self, indexer, allow_fill=None, fill_value=None)
| Take elements from the Categorical.
|
| Parameters
| ----------
| indexer : sequence of integers
| allow_fill : bool, default None.
| How to handle negative values in `indexer`.
|
| * False: negative values in `indices` indicate positional indices
| from the right. This is similar to
| :func:`numpy.take`.
|
| * True: negative values in `indices` indicate missing values
| (the default). These values are set to `fill_value`. Any
| other negative values raise a ``ValueError``.
|
| .. versionchanged:: 0.23.0
|
| Deprecated the default value of `allow_fill`. The deprecated
| default is ``True``. In the future, this will change to
| ``False``.
|
| Returns
| -------
| Categorical
| This Categorical will have the same categories and ordered as
| `self`.
|
| to_dense(self)
| Return my 'dense' representation
|
| For internal compatibility with numpy arrays.
|
| Returns
| -------
| dense : array
|
| tolist(self)
| Return a list of the values.
|
| These are each a scalar type, which is a Python scalar
| (for str, int, float) or a pandas scalar
| (for Timestamp/Timedelta/Interval/Period)
|
| unique(self)
| Return the ``Categorical`` whose ``categories`` and ``codes`` are
| unique. Unused categories are NOT returned.
|
| - unordered category: values and categories are sorted by appearance
| order.
| - ordered category: values are sorted by appearance order, categories
| keep their existing order.
|
| Returns
| -------
| unique values : ``Categorical``
|
| Examples
| --------
| An unordered Categorical will return categories in the
| order of appearance.
|
| >>> pd.Categorical(list('baabc')).unique()
| [b, a, c]
| Categories (3, object): [b, a, c]
|
| >>> pd.Categorical(list('baabc'), categories=list('abc')).unique()
| [b, a, c]
| Categories (3, object): [b, a, c]
|
| An ordered Categorical preserves the category ordering.
|
| >>> pd.Categorical(list('baabc'),
| ... categories=list('abc'),
| ... ordered=True).unique()
| [b, a, c]
| Categories (3, object): [a < b < c]
|
| See Also
| --------
| unique
| CategoricalIndex.unique
| Series.unique
|
| value_counts(self, dropna=True)
| Returns a Series containing counts of each category.
|
| Every category will have an entry, even those with a count of 0.
|
| Parameters
| ----------
| dropna : boolean, default True
| Don't include counts of NaN.
|
| Returns
| -------
| counts : Series
|
| See Also
| --------
| Series.value_counts
|
| view(self)
| Return a view of myself.
|
| For internal compatibility with numpy arrays.
|
| Returns
| -------
| view : Categorical
| Returns `self`!
|
| ----------------------------------------------------------------------
| Class methods defined here:
|
| from_codes(codes, categories, ordered=False) from builtins.type
| Make a Categorical type from codes and categories arrays.
|
| This constructor is useful if you already have codes and categories and
| so do not need the (computation intensive) factorization step, which is
| usually done on the constructor.
|
| If your data does not follow this convention, please use the normal
| constructor.
|
| Parameters
| ----------
| codes : array-like, integers
| An integer array, where each integer points to a category in
| categories or -1 for NaN
| categories : index-like
| The categories for the categorical. Items need to be unique.
| ordered : boolean, (default False)
| Whether or not this categorical is treated as an ordered
| categorical. If not given, the resulting categorical will be
| unordered.
|
| ----------------------------------------------------------------------
| Data descriptors defined here:
|
| T
|
| base
| compat, we are always our own object
|
| categories
| The categories of this categorical.
|
| Setting assigns new values to each category (effectively a rename of
| each individual category).
|
| The assigned value has to be a list-like object. All items must be
| unique and the number of items in the new categories must be the same
| as the number of items in the old categories.
|
| Assigning to `categories` is an inplace operation!
|
| Raises
| ------
| ValueError
| If the new categories do not validate as categories or if the
| number of new categories is unequal to the number of old categories
|
| See also
| --------
| rename_categories
| reorder_categories
| add_categories
| remove_categories
| remove_unused_categories
| set_categories
|
| codes
| The category codes of this categorical.
|
| Level codes are an array of integers which are the positions of the real
| values in the categories array.
|
| There is no setter, use the other categorical methods and the normal item
| setter to change values in the categorical.
|
| dtype
| The :class:`~pandas.api.types.CategoricalDtype` for this instance
|
| itemsize
| return the size of a single category
|
| nbytes
| The number of bytes needed to store this object in memory.
|
| ndim
| Number of dimensions of the Categorical
|
| ordered
| Whether the categories have an ordered relationship
|
| shape
| Shape of the Categorical.
|
| For internal compatibility with numpy arrays.
|
| Returns
| -------
| shape : tuple
|
| size
| return the len of myself
|
| ----------------------------------------------------------------------
| Data and other attributes defined here:
|
| __array_priority__ = 1000
|
| __hash__ = None
|
| ----------------------------------------------------------------------
| Methods inherited from pandas.core.arrays.base.ExtensionArray:
|
| factorize(self, na_sentinel=-1)
| Encode the extension array as an enumerated type.
|
| Parameters
| ----------
| na_sentinel : int, default -1
| Value to use in the `labels` array to indicate missing values.
|
| Returns
| -------
| labels : ndarray
| An integer NumPy array that's an indexer into the original
| ExtensionArray.
| uniques : ExtensionArray
| An ExtensionArray containing the unique values of `self`.
|
| .. note::
|
| uniques will *not* contain an entry for the NA value of
| the ExtensionArray if there are any missing values present
| in `self`.
|
| See Also
| --------
| pandas.factorize : Top-level factorize method that dispatches here.
|
| Notes
| -----
| :meth:`pandas.factorize` offers a `sort` keyword as well.
|
| ----------------------------------------------------------------------
| Data descriptors inherited from pandas.core.arrays.base.ExtensionArray:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
|
| ----------------------------------------------------------------------
| Methods inherited from pandas.core.base.PandasObject:
|
| __sizeof__(self)
| Generates the total memory usage for an object that returns
| either a value or Series of values
|
| ----------------------------------------------------------------------
| Methods inherited from pandas.core.base.StringMixin:
|
| __bytes__(self)
| Return a string representation for a particular object.
|
| Invoked by bytes(obj) in py3 only.
| Yields a bytestring in both py2/py3.
|
| __repr__(self)
| Return a string representation for a particular object.
|
| Yields Bytestring in Py2, Unicode String in py3.
|
| __str__(self)
| Return a string representation for a particular Object
|
| Invoked by str(df) in both py2/py3.
| Yields Bytestring in Py2, Unicode String in py3.
|
| ----------------------------------------------------------------------
| Methods inherited from pandas.core.accessor.DirNamesMixin:
|
| __dir__(self)
| Provide method name lookup and completion
| Only provide 'public' methods
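To make a few of the methods documented above concrete, here is a minimal sketch that exercises `from_codes`, `isna`, `isin`, `remove_categories` and `set_categories`. The sample codes and category labels are made up for illustration, and the calls avoid the `inplace` argument, since that argument may not be available in every pandas version.

```python
import pandas as pd

# Build a Categorical directly from integer codes; -1 stands for a missing value.
cat = pd.Categorical.from_codes([0, 1, -1, 0], categories=["a", "b"])

# isna() flags the positions whose code is -1.
missing = cat.isna()

# isin() expects a set or list-like, so a single label is wrapped in a list.
in_a = cat.isin(["a"])

# remove_categories() returns a new Categorical; values that belonged to the
# removed category become NaN.
without_b = cat.remove_categories(["b"])

# set_categories() adds and drops categories in one call: "a" is dropped
# (its values become NaN) and "c" is added as an unused category.
reset = cat.set_categories(["b", "c"])

print(missing)
print(in_a)
print(without_b)
print(reset)
```

Each call returns a new object here, so `cat` itself keeps its original categories throughout.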
C:\Ana\lib\site-packages\ipykernel_launcher.py:1: FutureWarning: specifying 'categories' or 'ordered' in .astype() is deprecated; pass a CategoricalDtype instead
"""Entry point for launching an IPython kernel.
C:\Ana\lib\site-packages\ipykernel_launcher.py:2: FutureWarning: specifying 'categories' or 'ordered' in .astype() is deprecated; pass a CategoricalDtype instead
0 False
1 False
2 True
dtype: bool
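The FutureWarning above asks for a `CategoricalDtype` to be passed to `.astype()` instead of separate `categories` or `ordered` arguments. The cell that triggered the warning is not shown, so the following is only a sketch with made-up sample data, not a reconstruction of the original code.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical sample data for illustration.
s = pd.Series(["a", "b", "c", "a"])

# Describe the categories and their ordering once, in a dtype object ...
dtype = CategoricalDtype(categories=["a", "b", "c"], ordered=True)

# ... and pass that dtype to astype() instead of categories=/ordered= keywords.
cat_s = s.astype(dtype)

# An ordered categorical supports comparisons against one of its categories.
greater_than_a = cat_s > "a"
print(greater_than_a)
```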
if pd.Series([False, True, False]).any():
    print("I am any")
I am any
if pd.Series([False, True, False]).all():
    print("I am any")
else:
    print("I am all")
I am all
pd.Series([False]).bool()
False
help(pd.Series([False]).bool)
Help on method bool in module pandas.core.generic:
bool() method of pandas.core.series.Series instance
Return the bool of a single element PandasObject.
This must be a boolean scalar value, either True or False. Raise a
ValueError if the PandasObject does not have exactly 1 element, or that
element is not boolean
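The help text above says a Series only reduces to a single boolean when it has exactly one element. A short sketch of what that means in practice follows. It assumes only standard pandas calls, and it uses `Series.item()` as one way to extract the scalar from a one-element Series; that choice is not from the original post, which calls `.bool()`.

```python
import pandas as pd

flags = pd.Series([False, True, False])

# A multi-element Series has no single truth value, so bool(flags) raises
# ValueError; this is why `if flags:` cannot be used directly.
try:
    bool(flags)
    ambiguous = False
except ValueError:
    ambiguous = True

# Reduce explicitly instead: any() asks "is at least one True?",
# all() asks "is every element True?".
any_true = bool(flags.any())
all_true = bool(flags.all())

# For a one-element Series, item() returns the single value as a scalar.
single = pd.Series([False]).item()

print(ambiguous, any_true, all_true, single)
```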