Series(*args, **kwargs)
N-dimensional analogue of DataFrame. Stores multi-dimensional data in a size-mutable, labeled data structure.
Properties
T
Return the transpose, which is by definition self.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series(['Ant', 'Bear', 'Cow'])
>>> s
0     Ant
1    Bear
2     Cow
dtype: string
>>> s.T
0     Ant
1    Bear
2     Cow
dtype: string
at
Access a single value for a row/column label pair.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]],
...                    index=[4, 5, 6], columns=['A', 'B', 'C'])
>>> df
    A   B   C
4   0   2   3
5   0   4   1
6  10  20  30
<BLANKLINE>
[3 rows x 3 columns]
Get value at specified row/column pair
>>> df.at[4, 'B']
np.int64(2)
Get value at specified row label
>>> df.loc[5].at['B']
np.int64(4)
| Returns | |
|---|---|
| Type | Description | 
| bigframes.core.indexers.AtSeriesIndexer | Indexers object. | 
blob
API documentation for blob property.
dt
Accessor object for datetime-like properties of the Series values.
Examples:
>>> import bigframes.pandas as bpd
>>> import pandas as pd
>>> bpd.options.display.progress_bar = None
>>> seconds_series = bpd.Series(pd.date_range("2000-01-01", periods=3, freq="s"))
>>> seconds_series
0    2000-01-01 00:00:00
1    2000-01-01 00:00:01
2    2000-01-01 00:00:02
dtype: timestamp[us][pyarrow]
>>> seconds_series.dt.second
0    0
1    1
2    2
dtype: Int64
>>> hours_series = bpd.Series(pd.date_range("2000-01-01", periods=3, freq="h"))
>>> hours_series
0    2000-01-01 00:00:00
1    2000-01-01 01:00:00
2    2000-01-01 02:00:00
dtype: timestamp[us][pyarrow]
>>> hours_series.dt.hour
0    0
1    1
2    2
dtype: Int64
>>> quarters_series = bpd.Series(pd.date_range("2000-01-01", periods=3, freq="QE"))
>>> quarters_series
0    2000-03-31 00:00:00
1    2000-06-30 00:00:00
2    2000-09-30 00:00:00
dtype: timestamp[us][pyarrow]
>>> quarters_series.dt.quarter
0    1
1    2
2    3
dtype: Int64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.operations.datetimes.DatetimeMethods | An accessor containing datetime methods. | 
dtype
Return the dtype object of the underlying data.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 2, 3])
>>> s.dtype
Int64Dtype()
dtypes
Return the dtypes in the DataFrame.
This returns a Series with the data type of each column. The result's index is the original DataFrame's columns. Columns with mixed types aren't supported yet in BigQuery DataFrames.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({'float': [1.0], 'int': [1], 'string': ['foo']})
>>> df.dtypes
float             Float64
int                 Int64
string    string[pyarrow]
dtype: object
| Returns | |
|---|---|
| Type | Description | 
| pandas.Series | A *pandas* Series with the data type of each column. | 
empty
Indicates whether Series/DataFrame is empty.
True if Series/DataFrame is entirely empty (no items), meaning any of the axes are of length 0.
| Returns | |
|---|---|
| Type | Description | 
| bool | If Series/DataFrame is empty, return True, if not return False. | 
geo
Accessor object for geography properties of the Series values.
| Returns | |
|---|---|
| Type | Description | 
| bigframes.geopandas.geoseries.GeoSeries | An accessor containing geography methods. | 
hasnans
Return True if there are any NaNs.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 2, 3, None])
>>> s
0     1.0
1     2.0
2     3.0
3    <NA>
dtype: Float64
>>> s.hasnans
np.True_
iat
Access a single value for a row/column pair by integer position.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]],
...                    columns=['A', 'B', 'C'])
>>> df
    A   B   C
0   0   2   3
1   0   4   1
2  10  20  30
<BLANKLINE>
[3 rows x 3 columns]
Get value at specified row/column pair
>>> df.iat[1, 2]
np.int64(1)
Get value within a series
>>> df.loc[0].iat[1]
np.int64(2)
| Returns | |
|---|---|
| Type | Description | 
| bigframes.core.indexers.IatSeriesIndexer | Indexers object. | 
iloc
Purely integer-location based indexing for selection by position.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> mydict = [{'a': 1, 'b': 2, 'c': 3, 'd': 4},
...               {'a': 100, 'b': 200, 'c': 300, 'd': 400},
...               {'a': 1000, 'b': 2000, 'c': 3000, 'd': 4000}]
>>> df = bpd.DataFrame(mydict)
>>> df
      a     b     c     d
0     1     2     3     4
1   100   200   300   400
2  1000  2000  3000  4000
<BLANKLINE>
[3 rows x 4 columns]
Indexing just the rows
With a scalar integer.
>>> type(df.iloc[0])
<class 'pandas.core.series.Series'>
>>> df.iloc[0]
a    1
b    2
c    3
d    4
Name: 0, dtype: Int64
With a list of integers.
>>> df.iloc[[0]]
   a  b  c  d
0  1  2  3  4
<BLANKLINE>
[1 rows x 4 columns]
>>> type(df.iloc[[0]])
<class 'bigframes.dataframe.DataFrame'>
>>> df.iloc[[0, 1]]
    a    b    c    d
0    1    2    3    4
1  100  200  300  400
<BLANKLINE>
[2 rows x 4 columns]
With a slice object.
>>> df.iloc[:3]
      a     b     c     d
0     1     2     3     4
1   100   200   300   400
2  1000  2000  3000  4000
<BLANKLINE>
[3 rows x 4 columns]
Indexing both axes
You can mix the indexer types for the index and columns. Use : to select the entire axis.
With scalar integers.
>>> df.iloc[0, 1]
np.int64(2)
| Returns | |
|---|---|
| Type | Description | 
| bigframes.core.indexers.IlocSeriesIndexer | Purely integer-location Indexers. | 
index
The index (axis labels) of the Series.
The index of a Series is used to label and identify each element of the underlying data. The index can be thought of as an immutable ordered set (technically a multi-set, as it may contain duplicate labels), and is used to index and align data.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
You can access the index of a Series via the index property.
>>> df = bpd.DataFrame({'Name': ['Alice', 'Bob', 'Aritra'],
...                     'Age': [25, 30, 35],
...                     'Location': ['Seattle', 'New York', 'Kona']},
...                    index=([10, 20, 30]))
>>> s = df["Age"]
>>> s
10    25
20    30
30    35
Name: Age, dtype: Int64
>>> s.index # doctest: +ELLIPSIS
Index([10, 20, 30], dtype='Int64')
>>> s.index.values
array([10, 20, 30])
A multi-index set on the DataFrame is also reflected in the index property of the Series.
>>> df1 = df.set_index(["Name", "Location"])
>>> s1 = df1["Age"]
>>> s1
Name    Location
Alice   Seattle     25
Bob     New York    30
Aritra  Kona        35
Name: Age, dtype: Int64
>>> s1.index # doctest: +ELLIPSIS
MultiIndex([( 'Alice',  'Seattle'),
            (   'Bob', 'New York'),
            ('Aritra',     'Kona')],
          names=['Name', 'Location'])
>>> s1.index.values
array([('Alice', 'Seattle'), ('Bob', 'New York'), ('Aritra', 'Kona')],
      dtype=object)
| Returns | |
|---|---|
| Type | Description | 
| Index | The index object of the Series. | 
is_monotonic_decreasing
Return boolean if values in the object are monotonically decreasing.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([3, 2, 2, 1])
>>> s.is_monotonic_decreasing
np.True_
>>> s = bpd.Series([1, 2, 3])
>>> s.is_monotonic_decreasing
np.False_
| Returns | |
|---|---|
| Type | Description | 
| bool | Boolean. | 
is_monotonic_increasing
Return boolean if values in the object are monotonically increasing.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 2, 2])
>>> s.is_monotonic_increasing
np.True_
>>> s = bpd.Series([3, 2, 1])
>>> s.is_monotonic_increasing
np.False_
| Returns | |
|---|---|
| Type | Description | 
| bool | Boolean. | 
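The examples for `is_monotonic_decreasing` and `is_monotonic_increasing` suggest that both checks are non-strict: repeated neighbouring values such as `[1, 2, 2]` still count as monotonic. The following is a plain-Python sketch of that rule, not the BigQuery DataFrames implementation. The function names are illustrative and missing values are not modelled.

```python
from typing import Sequence


def is_monotonic_increasing(values: Sequence[float]) -> bool:
    """True when no element is smaller than the one before it (ties allowed)."""
    return all(a <= b for a, b in zip(values, values[1:]))


def is_monotonic_decreasing(values: Sequence[float]) -> bool:
    """True when no element is larger than the one before it (ties allowed)."""
    return all(a >= b for a, b in zip(values, values[1:]))


print(is_monotonic_increasing([1, 2, 2]))     # True
print(is_monotonic_decreasing([3, 2, 2, 1]))  # True
print(is_monotonic_increasing([3, 2, 1]))     # False
```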
list
API documentation for list property.
loc
Access a group of rows and columns by label(s) or a boolean array.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame([[1, 2], [4, 5], [7, 8]],
...                    index=['cobra', 'viper', 'sidewinder'],
...                    columns=['max_speed', 'shield'])
>>> df
            max_speed  shield
cobra               1       2
viper               4       5
sidewinder          7       8
<BLANKLINE>
[3 rows x 2 columns]
Single label. Note this returns the row as a Series.
>>> df.loc['viper']
max_speed    4
shield       5
Name: viper, dtype: Int64
List of labels. Note using [[]] returns a DataFrame.
>>> df.loc[['viper', 'sidewinder']]
            max_speed  shield
viper               4       5
sidewinder          7       8
<BLANKLINE>
[2 rows x 2 columns]
Single label for row and column. Note this returns a scalar value.
>>> df.loc['cobra', 'shield']
np.int64(2)
Index (same behavior as df.reindex)
>>> df.loc[bpd.Index(["cobra", "viper"], name="foo")]
       max_speed  shield
cobra          1       2
viper          4       5
<BLANKLINE>
[2 rows x 2 columns]
Conditional that returns a boolean Series with column labels specified
>>> df.loc[df['shield'] > 6, ['max_speed']]
            max_speed
sidewinder          7
<BLANKLINE>
[1 rows x 1 columns]
Multiple conditional using | that returns a boolean Series
>>> df.loc[(df['max_speed'] > 4) | (df['shield'] < 5)]
            max_speed  shield
cobra               1       2
sidewinder          7       8
<BLANKLINE>
[2 rows x 2 columns]
Please ensure that each condition is wrapped in parentheses ().
Set value for an entire column
>>> df.loc[:, 'max_speed'] = 30
>>> df
            max_speed  shield
cobra              30       2
viper              30       5
sidewinder         30       8
<BLANKLINE>
[3 rows x 2 columns]
| Returns | |
|---|---|
| Type | Description | 
| bigframes.core.indexers.LocSeriesIndexer | Indexers object. | 
name
Return the name of the Series.
The name of a Series becomes its index or column name if it is used to form a DataFrame. It is also used whenever displaying the Series using the interpreter.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
For a Series:
>>> s = bpd.Series([1, 2, 3], dtype="Int64", name='Numbers')
>>> s
0    1
1    2
2    3
Name: Numbers, dtype: Int64
>>> s.name
'Numbers'
>>> s.name = "Integers"
>>> s
0    1
1    2
2    3
Name: Integers, dtype: Int64
If the Series is part of a DataFrame:
>>> df = bpd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
>>> df
   col1  col2
0     1     3
1     2     4
<BLANKLINE>
[2 rows x 2 columns]
>>> s = df["col1"]
>>> s.name
'col1'
| Returns | |
|---|---|
| Type | Description | 
| hashable object | The name of the Series, also the column name if part of a DataFrame. | 
ndim
Return an int representing the number of axes / array dimensions.
| Returns | |
|---|---|
| Type | Description | 
| int | Return 1 if Series. Otherwise return 2 if DataFrame. | 
plot
Make plots of Series.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> ser = bpd.Series([1, 2, 3, 3])
>>> plot = ser.plot(kind='hist', title="My plot")
>>> plot
<Axes: title={'center': 'My plot'}, ylabel='Frequency'>
| Returns | |
|---|---|
| Type | Description | 
| bigframes.operations.plotting.PlotAccessor | An accessor making plots. | 
query_job
BigQuery job metadata for the most recent query.
shape
Return a tuple of the shape of the underlying data.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 4, 9, 16])
>>> s.shape
(4,)
>>> s = bpd.Series(['Alice', 'Bob', bpd.NA])
>>> s.shape
(3,)
size
Return the number of elements in the underlying data.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
For Series:
>>> s = bpd.Series(['Ant', 'Bear', 'Cow'])
>>> s
0     Ant
1    Bear
2     Cow
dtype: string
>>> s.size
3
For Index:
>>> idx = bpd.Index(bpd.Series([1, 2, 3]))
>>> idx.size
3
| Returns | |
|---|---|
| Type | Description | 
| int | Return the number of elements in the underlying data. | 
str
Vectorized string functions for Series and Index.
NAs stay NA unless handled otherwise by a particular method. Patterned after Python’s string methods, with some inspiration from R’s stringr package.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series(["A_Str_Series"])
>>> s
0    A_Str_Series
dtype: string
>>> s.str.lower()
0    a_str_series
dtype: string
>>> s.str.replace("_", "")
0    AStrSeries
dtype: string
| Returns | |
|---|---|
| Type | Description | 
| bigframes.operations.strings.StringMethods | An accessor containing string methods. | 
struct
Accessor object for struct properties of the Series values.
| Returns | |
|---|---|
| Type | Description | 
| bigframes.operations.structs.StructAccessor | An accessor containing struct methods. | 
values
Return Series as ndarray or ndarray-like depending on the dtype.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> bpd.Series([1, 2, 3]).values
array([1, 2, 3])
>>> bpd.Series(list('aabc')).values
array(['a', 'a', 'b', 'c'], dtype=object)
| Returns | |
|---|---|
| Type | Description | 
| numpy.ndarray or ndarray-like | Values in the Series. | 
Methods
__add__
__add__(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Get addition of Series and other, element-wise, using operator +.
Equivalent to Series.add(other).
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1.5, 2.6], index=['elk', 'moose'])
>>> s
elk      1.5
moose    2.6
dtype: Float64
You can add a scalar.
>>> s + 1.5
elk      3.0
moose    4.1
dtype: Float64
You can add another Series with index aligned.
>>> delta = bpd.Series([1.5, 2.6], index=['elk', 'moose'])
>>> s + delta
elk      3.0
moose    5.2
dtype: Float64
Adding any mis-aligned index will result in invalid values.
>>> delta = bpd.Series([1.5, 2.6], index=['moose', 'bison'])
>>> s + delta
elk      <NA>
moose     4.1
bison    <NA>
dtype: Float64
| Parameter | |
|---|---|
| Name | Description | 
| other | scalar or Series: Object to be added to the Series. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of adding other to Series. | 
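The mis-aligned example above can be read as a label-by-label join: a value is produced only for labels present on both sides, and every other label becomes missing. Here is a minimal plain-Python sketch of that idea, using dictionaries in place of Series. The helper name `aligned_add`, the use of `None` for `<NA>`, and the label ordering (left labels first) are assumptions made for illustration only.

```python
from typing import Dict, Optional


def aligned_add(
    left: Dict[str, float], right: Dict[str, float]
) -> Dict[str, Optional[float]]:
    """Add two label -> value mappings, matching values by label."""
    # Keep the left labels first, then labels that exist only on the right.
    labels = list(left) + [label for label in right if label not in left]
    result: Dict[str, Optional[float]] = {}
    for label in labels:
        if label in left and label in right:
            result[label] = left[label] + right[label]
        else:
            result[label] = None  # stands in for <NA>
    return result


s = {"elk": 1.5, "moose": 2.6}
delta = {"moose": 1.5, "bison": 2.6}
result = aligned_add(s, delta)
print(result)  # only "moose" is shared, so only "moose" gets a sum
```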
__and__
__and__(other: bool | int | bigframes.series.Series) -> bigframes.series.Series
Get bitwise AND of Series and other, element-wise, using operator &.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([0, 1, 2, 3])
You can operate with a scalar.
>>> s & 6
0    0
1    0
2    2
3    2
dtype: Int64
You can operate with another Series.
>>> s1 = bpd.Series([5, 6, 7, 8])
>>> s & s1
0    0
1    0
2    2
3    0
dtype: Int64
| Parameter | |
|---|---|
| Name | Description | 
| other | scalar or Series: Object to bitwise AND with the Series. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
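To see why `s & 6` yields `0, 0, 2, 2`, it helps to write the operands in binary: a result bit is set only where both inputs have that bit set. The sketch below uses plain Python integers rather than a Series, with the same values as the example above.

```python
values = [0, 1, 2, 3]
mask = 6  # binary 110

# Print each operand in binary next to the decimal result.
for value in values:
    result = value & mask
    print(f"{value:03b} & {mask:03b} = {result:03b}  ({value} & {mask} = {result})")

and_results = [value & mask for value in values]
print(and_results)  # [0, 0, 2, 2]
```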
__array__
__array__(dtype=None) -> numpy.ndarray
Returns the values as a NumPy array.
Equivalent to Series.to_numpy(dtype).
Users should not call this directly. Rather, it is invoked by
numpy.array and numpy.asarray.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> import numpy as np
>>> ser = bpd.Series([1, 2, 3])
>>> np.asarray(ser)
array([1, 2, 3])
| Parameter | |
|---|---|
| Name | Description | 
| dtype | str or numpy.dtype, optional: The dtype to use for the resulting NumPy array. By default, the dtype is inferred from the data. | 
| Returns | |
|---|---|
| Type | Description | 
| numpy.ndarray | The values in the series converted to a numpy.ndarray with the specified dtype. | 
__array_ufunc__
__array_ufunc__(
    ufunc: numpy.ufunc, method: str, *inputs, **kwargs
) -> bigframes.series.Series
Used to support numpy ufuncs. See: https://numpy.org/doc/stable/reference/ufuncs.html
__floordiv__
__floordiv__(
    other: float | int | bigframes.series.Series,
) -> bigframes.series.Series
Get integer division of Series by other, using arithmetic operator //.
Equivalent to Series.floordiv(other).
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
You can divide by a scalar:
>>> s = bpd.Series([15, 30, 45])
>>> s // 2
0     7
1    15
2    22
dtype: Int64
You can also divide by another Series:
>>> divisor = bpd.Series([3, 4, 4])
>>> s // divisor
0     5
1     7
2    11
dtype: Int64
| Parameter | |
|---|---|
| Name | Description | 
| other | scalar or Series: Object to divide the Series by. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the integer division. | 
__getitem__
__getitem__(indexer)
Gets the specified index from the Series.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([15, 30, 45])
>>> s[1]
np.int64(30)
>>> s[0:2]
0    15
1    30
dtype: Int64
| Parameter | |
|---|---|
| Name | Description | 
| indexer | int or slice: Index or slice of indices. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series or Value | Value(s) at the requested index(es). | 
__invert__
__invert__() -> bigframes.series.Series
Returns the logical inversion (binary NOT) of the Series, element-wise, using operator ~.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> ser = bpd.Series([True, False, True])
>>> ~ser
0    False
1     True
2    False
dtype: boolean
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The inverted values in the series. | 
__len__
__len__()
Returns the number of values in the Series; used by the built-in len().
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 2, 3])
>>> len(s)
3
__matmul__
__matmul__(other)
Matrix multiplication using binary @ operator.
__mod__
__mod__(other) -> bigframes.series.Series
Get modulo of Series with other, element-wise, using operator %.
Equivalent to Series.mod(other).
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
You can modulo with a scalar:
>>> s = bpd.Series([1, 2, 3])
>>> s % 3
0    1
1    2
2    0
dtype: Int64
You can also modulo with another Series:
>>> modulo = bpd.Series([3, 3, 3])
>>> s % modulo
0    1
1    2
2    0
dtype: Int64
| Parameter | |
|---|---|
| Name | Description | 
| other | scalar or Series: Object to modulo the Series by. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the modulo. | 
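For positive integers, the results of `//` and `%` are linked: the dividend equals the quotient times the divisor plus the remainder. The sketch below checks this on plain Python integers, reusing the values from the `__floordiv__` example. It makes no claim about how BigQuery DataFrames treats negative operands, zero divisors, or missing values.

```python
dividends = [15, 30, 45]
divisors = [3, 4, 4]

for a, b in zip(dividends, divisors):
    quotient, remainder = a // b, a % b
    # Floor division and modulo together reconstruct the dividend.
    assert a == quotient * b + remainder
    print(f"{a} // {b} = {quotient}, {a} % {b} = {remainder}")

quotients = [a // b for a, b in zip(dividends, divisors)]
print(quotients)  # [5, 7, 11]
```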
__mul__
__mul__(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Get multiplication of Series with other, element-wise, using operator *.
Equivalent to Series.mul(other).
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
You can multiply with a scalar:
>>> s = bpd.Series([1, 2, 3])
>>> s * 3
0    3
1    6
2    9
dtype: Int64
You can also multiply with another Series:
>>> s1 = bpd.Series([2, 3, 4])
>>> s * s1
0     2
1     6
2    12
dtype: Int64
| Parameter | |
|---|---|
| Name | Description | 
| other | scalar or Series: Object to multiply with the Series. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the multiplication. | 
__or__
__or__(other: bool | int | bigframes.series.Series) -> bigframes.series.Series
Get bitwise OR of Series and other, element-wise, using operator |.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([0, 1, 2, 3])
You can operate with a scalar.
>>> s | 6
0    6
1    7
2    6
3    7
dtype: Int64
You can operate with another Series.
>>> s1 = bpd.Series([5, 6, 7, 8])
>>> s | s1
0     5
1     7
2     7
3    11
dtype: Int64
| Parameter | |
|---|---|
| Name | Description | 
| other | scalar or Series: Object to bitwise OR with the Series. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
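In Python, the `|` operator dispatches to `__or__`, while `^` dispatches to `__xor__`. The two give the same result only when the operands share no set bits. A minimal sketch on plain integers (not bigframes objects) shows where they differ for the inputs used in this section:

```python
values = [0, 1, 2, 3]
mask = 6  # binary 110

or_results = [value | mask for value in values]   # dispatches to int.__or__
xor_results = [value ^ mask for value in values]  # dispatches to int.__xor__

print(or_results)   # [6, 7, 6, 7]
print(xor_results)  # [6, 7, 4, 5]
```

The results diverge for 2 and 3 because both share a set bit with 6: OR keeps that bit, XOR clears it.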
__pow__
__pow__(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Get exponentiation of Series with other, element-wise, using operator **.
Equivalent to Series.pow(other).
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
You can exponentiate with a scalar:
>>> s = bpd.Series([1, 2, 3])
>>> s ** 2
0    1
1    4
2    9
dtype: Int64
You can also exponentiate with another Series:
>>> exponent = bpd.Series([3, 2, 1])
>>> s ** exponent
0    1
1    4
2    3
dtype: Int64
| Parameter | |
|---|---|
| Name | Description | 
| other | scalar or Series: Object to exponentiate the Series with. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the exponentiation. | 
__radd__
__radd__(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Get addition of Series and other, element-wise, using operator +.
Equivalent to Series.radd(other).
| Parameter | |
|---|---|
| Name | Description | 
| other | scalar or Series: Object to which Series should be added. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of adding Series to other. | 
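The reflected methods (`__radd__`, `__rsub__`, `__rmul__`, and so on) exist so that expressions with the Series on the right-hand side, such as `5 + s`, still work. Python first asks the left operand to handle the operation and falls back to the right operand's reflected method when the left operand returns `NotImplemented`. The following sketch shows that dispatch with a hypothetical `Celsius` class, which is not part of bigframes:

```python
class Celsius:
    """Hypothetical wrapper used only to illustrate operator dispatch."""

    def __init__(self, degrees: float) -> None:
        self.degrees = degrees

    def __add__(self, other: float) -> "Celsius":
        # Handles `Celsius + number`.
        return Celsius(self.degrees + other)

    def __radd__(self, other: float) -> "Celsius":
        # Handles `number + Celsius`: int.__add__ returns NotImplemented
        # for a Celsius operand, so Python falls back to this method.
        return Celsius(other + self.degrees)


left = Celsius(20) + 5   # uses __add__
right = 5 + Celsius(20)  # uses __radd__
print(left.degrees, right.degrees)  # 25 25
```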
__rand__
__rand__(other: bool | int | bigframes.series.Series) -> bigframes.series.Series
Get bitwise AND of Series and other, element-wise, using operator &.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([0, 1, 2, 3])
You can operate with a scalar.
>>> s & 6
0    0
1    0
2    2
3    2
dtype: Int64
You can operate with another Series.
>>> s1 = bpd.Series([5, 6, 7, 8])
>>> s & s1
0    0
1    0
2    2
3    0
dtype: Int64
| Parameter | |
|---|---|
| Name | Description | 
| other | scalar or Series: Object to bitwise AND with the Series. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
__rfloordiv__
__rfloordiv__(
    other: float | int | bigframes.series.Series,
) -> bigframes.series.Series
Get integer division of other by Series, using arithmetic operator //.
Equivalent to Series.rfloordiv(other).
| Parameter | |
|---|---|
| Name | Description | 
| other | scalar or Series: Object to divide by the Series. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the integer division. | 
__rmatmul__
__rmatmul__(other)
Matrix multiplication using binary @ operator.
__rmod__
__rmod__(other) -> bigframes.series.Series
Get modulo of other with Series, element-wise, using operator %.
Equivalent to Series.rmod(other).
| Parameter | |
|---|---|
| Name | Description | 
| other | scalar or Series: Object to modulo by the Series. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the modulo. | 
__rmul__
__rmul__(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Get multiplication of other with Series, element-wise, using operator *.
Equivalent to Series.rmul(other).
| Parameter | |
|---|---|
| Name | Description | 
| other | scalar or Series: Object to multiply the Series with. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the multiplication. | 
__ror__
__ror__(other: bool | int | bigframes.series.Series) -> bigframes.series.Series
Get bitwise OR of Series and other, element-wise, using operator |.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([0, 1, 2, 3])
You can operate with a scalar.
>>> s | 6
0    6
1    7
2    6
3    7
dtype: Int64
You can operate with another Series.
>>> s1 = bpd.Series([5, 6, 7, 8])
>>> s | s1
0     5
1     7
2     7
3    11
dtype: Int64
| Parameter | |
|---|---|
| Name | Description | 
| other | scalar or Series: Object to bitwise OR with the Series. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
__rpow__
__rpow__(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Get exponentiation of other with Series, element-wise, using operator **.
Equivalent to Series.rpow(other).
| Parameter | |
|---|---|
| Name | Description | 
| other | scalar or Series: Object to exponentiate with the Series. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the exponentiation. | 
__rsub__
__rsub__(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Get subtraction of Series from other, element-wise, using operator -.
Equivalent to Series.rsub(other).
| Parameter | |
|---|---|
| Name | Description | 
| other | scalar or Series: Object to subtract the Series from. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of subtraction. | 
__rtruediv__
__rtruediv__(
    other: float | int | bigframes.series.Series,
) -> bigframes.series.Series
Get division of other by Series, element-wise, using operator /.
Equivalent to Series.rtruediv(other).
| Parameter | |
|---|---|
| Name | Description | 
| other | scalar or Series: Object to divide by the Series. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the division. | 
__sub__
__sub__(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Get subtraction of other from Series, element-wise, using operator -.
Equivalent to Series.sub(other).
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1.5, 2.6], index=['elk', 'moose'])
>>> s
elk      1.5
moose    2.6
dtype: Float64
You can subtract a scalar.
>>> s - 1.5
elk      0.0
moose    1.1
dtype: Float64
You can subtract another Series with index aligned.
>>> delta = bpd.Series([0.5, 1.0], index=['elk', 'moose'])
>>> s - delta
elk      1.0
moose    1.6
dtype: Float64
Subtracting a Series with a mis-aligned index will result in invalid values.
>>> delta = bpd.Series([0.5, 1.0], index=['moose', 'bison'])
>>> s - delta
elk      <NA>
moose     2.1
bison    <NA>
dtype: Float64
| Parameter | |
|---|---|
| Name | Description | 
| other | scalar or Series: Object to subtract from the Series. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of subtraction. | 
__truediv__
__truediv__(
    other: float | int | bigframes.series.Series,
) -> bigframes.series.Series
Get division of Series by other, element-wise, using operator /.
Equivalent to Series.truediv(other).
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
You can divide by a scalar:
>>> s = bpd.Series([1, 2, 3])
>>> s / 2
0    0.5
1    1.0
2    1.5
dtype: Float64
You can also divide by another Series:
>>> denominator = bpd.Series([2, 3, 4])
>>> s / denominator
0         0.5
1    0.666667
2        0.75
dtype: Float64
| Parameter | |
|---|---|
| Name | Description | 
| other | scalar or Series: Object to divide the Series by. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the division. | 
abs
abs() -> bigframes.series.Series
Return a Series/DataFrame with absolute numeric value of each element.
This function only applies to elements that are all numeric.
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame or bigframes.pandas.Series | A Series or DataFrame containing the absolute value of each element. | 
add
add(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Return addition of Series and other, element-wise (binary operator add).
Equivalent to series + other, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 2, 3, bpd.NA])
>>> a
0       1
1       2
2       3
3    <NA>
dtype: Int64
>>> b = bpd.Series([10, 20, 30, 40])
>>> b
0     10
1     20
2     30
3     40
dtype: Int64
>>> a.add(b)
0      11
1      22
2      33
3    <NA>
dtype: Int64
You can also use the mathematical operator +:
>>> a + b
0      11
1      22
2      33
3    <NA>
dtype: Int64
Adding two Series with explicit indexes:
>>> a = bpd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])
>>> b = bpd.Series([10, 20, 30, 40], index=['a', 'b', 'd', 'e'])
>>> a.add(b)
a      11
b      22
c    <NA>
d      34
e    <NA>
dtype: Int64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
add_prefix
add_prefix(prefix: str, axis: int | str | None = None) -> bigframes.series.Series
Prefix labels with string prefix.
For Series, the row labels are prefixed. For DataFrame, the column labels are prefixed.
| Parameters | |
|---|---|
| Name | Description | 
| prefix | str: The string to add before each label. | 
| axis | int or str or None, default None | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame or bigframes.pandas.Series | New Series or DataFrame with updated labels. | 
add_suffix
add_suffix(suffix: str, axis: int | str | None = None) -> bigframes.series.Series
Suffix labels with string suffix.
For Series, the row labels are suffixed. For DataFrame, the column labels are suffixed.
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame or bigframes.pandas.Series | New Series or DataFrame with updated labels. | 
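Neither `add_prefix` nor `add_suffix` comes with an example here, so the sketch below illustrates the described behavior for a Series: only the row labels change, and the values are left untouched. It uses a plain dictionary as a stand-in for a Series. The helper functions are illustrative and are not the bigframes implementation.

```python
from typing import Dict, Hashable


def add_prefix(data: Dict[Hashable, int], prefix: str) -> Dict[str, int]:
    """Return a new mapping whose labels start with `prefix`."""
    return {f"{prefix}{label}": value for label, value in data.items()}


def add_suffix(data: Dict[Hashable, int], suffix: str) -> Dict[str, int]:
    """Return a new mapping whose labels end with `suffix`."""
    return {f"{label}{suffix}": value for label, value in data.items()}


series_like = {0: 10, 1: 20, 2: 30}  # label -> value
prefixed = add_prefix(series_like, "item_")
suffixed = add_suffix(series_like, "_item")
print(prefixed)  # {'item_0': 10, 'item_1': 20, 'item_2': 30}
print(suffixed)  # {'0_item': 10, '1_item': 20, '2_item': 30}
```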
agg
agg(
    func: typing.Union[str, typing.Sequence[str]]
) -> typing.Union[typing.Any, bigframes.series.Series]
Aggregate using one or more operations over the specified axis.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 2, 3, 4])
>>> s
0    1
1    2
2    3
3    4
dtype: Int64
>>> s.agg('min')
np.int64(1)
>>> s.agg(['min', 'max'])
min    1
max    4
dtype: Int64
| Parameter | |
|---|---|
| Name | Description | 
| func | function: Function to use for aggregating the data. Accepted combinations are: string function name, list of function names, e.g.  | 
| Returns | |
|---|---|
| Type | Description | 
| scalar or bigframes.pandas.Series | Aggregated results. | 
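The return type depends on the shape of `func`: a single function name gives a scalar, while a list of names gives one labeled result per name. The following plain-Python sketch mirrors that dispatch. The `AGGREGATIONS` table and the stand-alone `agg` helper are assumptions for illustration and cover only three aggregation names.

```python
from typing import Callable, Dict, Sequence, Union

# Illustrative lookup table; the real set of supported names is larger.
AGGREGATIONS: Dict[str, Callable[[Sequence[int]], int]] = {
    "min": min,
    "max": max,
    "sum": sum,
}


def agg(
    values: Sequence[int], func: Union[str, Sequence[str]]
) -> Union[int, Dict[str, int]]:
    """Apply one named aggregation (scalar) or several (labeled results)."""
    if isinstance(func, str):
        return AGGREGATIONS[func](values)
    return {name: AGGREGATIONS[name](values) for name in func}


single = agg([1, 2, 3, 4], "min")
several = agg([1, 2, 3, 4], ["min", "max"])
print(single)   # 1
print(several)  # {'min': 1, 'max': 4}
```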
aggregate
aggregate(
    func: typing.Union[str, typing.Sequence[str]]
) -> typing.Union[typing.Any, bigframes.series.Series]
Aggregate using one or more operations over the specified axis.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 2, 3, 4])
>>> s
0    1
1    2
2    3
3    4
dtype: Int64
>>> s.agg('min')
np.int64(1)
>>> s.agg(['min', 'max'])
min    1
max    4
dtype: Int64
| Parameter | |
|---|---|
| Name | Description | 
| func | function: Function to use for aggregating the data. Accepted combinations are: string function name, list of function names, e.g.  | 
| Returns | |
|---|---|
| Type | Description | 
| scalar or bigframes.pandas.Series | Aggregated results. | 
all
all() -> bool
Return whether all elements are True, potentially over an axis.
Returns True unless there is at least one element within a Series or along a DataFrame axis that is False or equivalent (e.g. zero or empty).
| Returns | |
|---|---|
| Type | Description | 
| scalar or bigframes.pandas.Series | If level is specified, then, Series is returned; otherwise, scalar is returned. | 
any
any() -> bool
Return whether any element is True, potentially over an axis.
Returns False unless there is at least one element within a Series or along a DataFrame axis that is True or equivalent (e.g. non-zero or non-empty).
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
For Series input, the output is a scalar indicating whether any element is True.
>>> bpd.Series([False, False]).any()
np.False_
>>> bpd.Series([True, False]).any()
np.True_
>>> bpd.Series([], dtype="float64").any()
np.False_
>>> bpd.Series([np.nan]).any()
np.False_
| Returns | |
|---|---|
| Type | Description | 
| scalar or bigframes.pandas.Series | If level is specified, then, Series is returned; otherwise, scalar is returned. | 
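The empty and NaN cases in the examples above are consistent with missing values being skipped before truthiness is tested. A minimal plain-Python sketch of that rule follows; series_any is a hypothetical helper, not part of bigframes, and it assumes missing values appear as None or float NaN:

```python
import math


def _is_missing(value):
    """Treat None and float NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def series_any(values):
    """Return True if at least one non-missing element is truthy."""
    return any(bool(v) for v in values if not _is_missing(v))


print(series_any([False, False]))   # False
print(series_any([True, False]))    # True
print(series_any([]))               # False
print(series_any([float("nan")]))   # False
```

An empty input and an all-missing input both leave nothing to test, which is why both report False.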
apply
apply(
    func, by_row: typing.Union[typing.Literal["compat"], bool] = "compat"
) -> bigframes.series.Series
Invoke function on values of a Series.
func can be a ufunc (a NumPy function that applies to the entire Series) or a
Python function that only works on single values. For an arbitrary
Python function, converting it into a remote_function is recommended.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
To apply an arbitrary Python function, a remote_function is recommended.
The reuse=False flag below makes sure a new remote_function
is created every time the following code runs; omit it
to potentially reuse a previously deployed remote_function from
the same user-defined function.
>>> @bpd.remote_function(reuse=False)
... def minutes_to_hours(x: int) -> float:
...     return x/60
>>> minutes = bpd.Series([0, 30, 60, 90, 120])
>>> minutes
0      0
1     30
2     60
3     90
4    120
dtype: Int64
>>> hours = minutes.apply(minutes_to_hours)
>>> hours
0    0.0
1    0.5
2    1.0
3    1.5
4    2.0
dtype: Float64
To turn a user-defined function with external package dependencies into
a remote_function, provide the names of the packages via the
packages parameter.
>>> @bpd.remote_function(
...     reuse=False,
...     packages=["cryptography"],
... )
... def get_hash(input: str) -> str:
...     from cryptography.fernet import Fernet
...
...     # handle missing value
...     if input is None:
...         input = ""
...
...     key = Fernet.generate_key()
...     f = Fernet(key)
...     return f.encrypt(input.encode()).decode()
>>> names = bpd.Series(["Alice", "Bob"])
>>> hashes = names.apply(get_hash)
A remote function can also return an array:
>>> @bpd.remote_function(reuse=False)
... def text_analyzer(text: str) -> list[int]:
...     words = text.count(" ") + 1
...     periods = text.count(".")
...     exclamations = text.count("!")
...     questions = text.count("?")
...     return [words, periods, exclamations, questions]
>>> texts = bpd.Series([
...     "The quick brown fox jumps over the lazy dog.",
...     "I love this product! It's amazing.",
...     "Hungry? Wanna eat? Lets go!"
... ])
>>> features = texts.apply(text_analyzer)
>>> features
0    [9 1 0 0]
1    [6 1 1 0]
2    [5 0 1 2]
dtype: list<item: int64>[pyarrow]
Simple vectorized functions, lambdas or ufuncs can be applied directly
with by_row=False.
>>> nums = bpd.Series([1, 2, 3, 4])
>>> nums
0    1
1    2
2    3
3    4
dtype: Int64
>>> nums.apply(lambda x: x*x + 2*x + 1, by_row=False)
0     4
1     9
2    16
3    25
dtype: Int64
>>> def is_odd(num):
...     return num % 2 == 1
>>> nums.apply(is_odd, by_row=False)
0     True
1    False
2     True
3    False
dtype: boolean
>>> nums.apply(np.log, by_row=False)
0         0.0
1    0.693147
2    1.098612
3    1.386294
dtype: Float64
| Parameters | |
|---|---|
| Name | Description | 
| func | function. BigFrames DataFrames  |
| by_row | False or "compat", default "compat". If  |
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | A new Series with values representing the return value of func applied to each element of the original Series. |
area
area(
    x: typing.Optional[typing.Hashable] = None,
    y: typing.Optional[typing.Hashable] = None,
    stacked: bool = True,
    **kwargs
)
Draw a stacked area plot. An area plot displays quantitative data visually.
This function calls pandas.plot to generate a plot with a random sample
of items. For consistent results, the random sampling is reproducible.
Use the sampling_random_state parameter to modify the sampling seed.
Examples:
Draw an area plot based on basic business metrics:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame(
...     {
...         'sales': [3, 2, 3, 9, 10, 6],
...         'signups': [5, 5, 6, 12, 14, 13],
...         'visits': [20, 42, 28, 62, 81, 50],
...     },
...     index=["01-31", "02-28", "03-31", "04-30", "05-31", "06-30"]
... )
>>> ax = df.plot.area()
Area plots are stacked by default. To produce an unstacked plot,
pass stacked=False:
>>> ax = df.plot.area(stacked=False)
Draw an area plot for a single column:
>>> ax = df.plot.area(y='sales')
Draw with a different x:
>>> df = bpd.DataFrame({
...     'sales': [3, 2, 3],
...     'visits': [20, 42, 28],
...     'day': [1, 2, 3],
... })
>>> ax = df.plot.area(x='day')
| Parameters | |
|---|---|
| Name | Description | 
| x | label or position, optional. Coordinates for the X axis. By default uses the index. |
| y | label or position, optional. Column to plot. By default uses all columns. |
| stacked | bool, default True. Area plots are stacked by default. Set to False to create an unstacked plot. |
| sampling_n | int, default 100. Number of random items for plotting. |
| sampling_random_state | int, default 0. Seed for random number generator. |
| Returns | |
|---|---|
| Type | Description | 
| matplotlib.axes.Axes or numpy.ndarray | Area plot, or array of area plots if subplots is True. | 
argmax
argmax() -> int
Return int position of the largest value in the Series.
If the maximum is achieved in multiple locations, the first row position is returned.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
Consider a dataset containing cereal calories.
>>> s = bpd.Series({'Corn Flakes': 100.0, 'Almond Delight': 110.0,
...                 'Cinnamon Toast Crunch': 120.0, 'Cocoa Puff': 110.0})
>>> s
Corn Flakes              100.0
Almond Delight           110.0
Cinnamon Toast Crunch    120.0
Cocoa Puff               110.0
dtype: Float64
>>> s.argmax()
np.int64(2)
>>> s.argmin()
np.int64(0)
The maximum calorie value is the third element (position 2) and the minimum is the first element (position 0), since positions in a Series are zero-indexed.
| Returns | |
|---|---|
| Type | Description | 
| int | Row position of the maximum value. | 
argmin
argmin() -> int
Return int position of the smallest value in the Series.
If the minimum is achieved in multiple locations, the first row position is returned.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
Consider a dataset containing cereal calories.
>>> s = bpd.Series({'Corn Flakes': 100.0, 'Almond Delight': 110.0,
...                 'Cinnamon Toast Crunch': 120.0, 'Cocoa Puff': 110.0})
>>> s
Corn Flakes              100.0
Almond Delight           110.0
Cinnamon Toast Crunch    120.0
Cocoa Puff               110.0
dtype: Float64
>>> s.argmax()
np.int64(2)
>>> s.argmin()
np.int64(0)
The maximum calorie value is the third element (position 2) and the minimum is the first element (position 0), since positions in a Series are zero-indexed.
| Returns | |
|---|---|
| Type | Description | 
| int | Row position of the minimum value. | 
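The tie-breaking rule for argmax and argmin (the first row position wins) can be sketched in plain Python. The helper names below are hypothetical and only illustrate the rule; they are not how bigframes computes the result:

```python
def argmax_first(values):
    """Position of the largest value; ties resolve to the earliest position."""
    if not values:
        raise ValueError("argmax of an empty sequence")
    # max() returns the first maximal item it encounters.
    return max(range(len(values)), key=lambda i: values[i])


def argmin_first(values):
    """Position of the smallest value; ties resolve to the earliest position."""
    if not values:
        raise ValueError("argmin of an empty sequence")
    return min(range(len(values)), key=lambda i: values[i])


calories = [100.0, 110.0, 120.0, 110.0]
print(argmax_first(calories))        # 2
print(argmin_first(calories))        # 0
print(argmax_first([1, 3, 3]))       # 1 (first of the tied maxima)
```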
astype
astype(
    dtype: typing.Union[
        typing.Literal[
            "boolean",
            "Float64",
            "Int64",
            "int64[pyarrow]",
            "string",
            "string[pyarrow]",
            "timestamp[us, tz=UTC][pyarrow]",
            "timestamp[us][pyarrow]",
            "date32[day][pyarrow]",
            "time64[us][pyarrow]",
            "decimal128(38, 9)[pyarrow]",
            "decimal256(76, 38)[pyarrow]",
            "binary[pyarrow]",
        ],
        pandas.core.arrays.boolean.BooleanDtype,
        pandas.core.arrays.floating.Float64Dtype,
        pandas.core.arrays.integer.Int64Dtype,
        pandas.core.arrays.string_.StringDtype,
        pandas.core.dtypes.dtypes.ArrowDtype,
        geopandas.array.GeometryDtype,
    ],
    *,
    errors: typing.Literal["raise", "null"] = "raise"
) -> bigframes.series.Series
Cast a pandas object to a specified dtype.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
Create a DataFrame:
>>> d = {'col1': [1, 2], 'col2': [3, 4]}
>>> df = bpd.DataFrame(data=d)
>>> df.dtypes
col1    Int64
col2    Int64
dtype: object
Cast all columns to Float64:
>>> df.astype('Float64').dtypes
col1    Float64
col2    Float64
dtype: object
Create a series of type Int64:
>>> ser = bpd.Series([2023010000246789, 1624123244123101, 1054834234120101], dtype='Int64')
>>> ser
0    2023010000246789
1    1624123244123101
2    1054834234120101
dtype: Int64
Convert to Float64 type:
>>> ser.astype('Float64')
0    2023010000246789.0
1    1624123244123101.0
2    1054834234120101.0
dtype: Float64
Convert to pd.ArrowDtype(pa.timestamp("us", tz="UTC")) type:
>>> ser.astype("timestamp[us, tz=UTC][pyarrow]")
0    2034-02-08 11:13:20.246789+00:00
1    2021-06-19 17:20:44.123101+00:00
2    2003-06-05 17:30:34.120101+00:00
dtype: timestamp[us, tz=UTC][pyarrow]
Note that this is equivalent to using to_datetime with unit='us':
>>> bpd.to_datetime(ser, unit='us', utc=True)
0    2034-02-08 11:13:20.246789+00:00
1    2021-06-19 17:20:44.123101+00:00
2    2003-06-05 17:30:34.120101+00:00
dtype: timestamp[us, tz=UTC][pyarrow]
Convert pd.ArrowDtype(pa.timestamp("us", tz="UTC")) type to Int64 type:
>>> timestamp_ser = ser.astype("timestamp[us, tz=UTC][pyarrow]")
>>> timestamp_ser.astype('Int64')
0    2023010000246789
1    1624123244123101
2    1054834234120101
dtype: Int64
| Parameters | |
|---|---|
| Name | Description | 
| dtype | str or pandas.ExtensionDtype. A dtype supported by BigQuery DataFrame include  |
| errors | {'raise', 'null'}, default 'raise'. Control raising of exceptions on invalid data for provided dtype. If 'raise', allow exceptions to be raised if any value fails the cast. If 'null', assign a null value if a value fails the cast. |
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame | A BigQuery DataFrame. | 
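The Int64-to-timestamp examples above treat each integer as microseconds since the Unix epoch. A standard-library sketch of that interpretation, using the hypothetical helpers micros_to_utc and utc_to_micros, reproduces the first value from the example:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def micros_to_utc(micros):
    """Interpret an integer as microseconds since the Unix epoch (UTC)."""
    return EPOCH + timedelta(microseconds=micros)


def utc_to_micros(ts):
    """Inverse conversion, done in integer arithmetic to avoid float rounding."""
    delta = ts - EPOCH
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


ts = micros_to_utc(2023010000246789)
print(ts.isoformat())       # 2034-02-08T11:13:20.246789+00:00
print(utc_to_micros(ts))    # 2023010000246789
```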
autocorr
autocorr(lag: int = 1) -> float
Compute the lag-N autocorrelation.
This method computes the Pearson correlation between the Series and its shifted self.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([0.25, 0.5, 0.2, -0.05])
>>> s.autocorr()  # doctest: +ELLIPSIS
np.float64(0.10355263309024067)
>>> s.autocorr(lag=2)
np.float64(-1.0)
If the Pearson correlation is not well defined, then 'NaN' is returned.
>>> s = bpd.Series([1, 0, 0, 0])
>>> s.autocorr()
np.float64(nan)
| Parameter | |
|---|---|
| Name | Description | 
| lag | int, default 1. Number of lags to apply before performing autocorrelation. |
| Returns | |
|---|---|
| Type | Description | 
| float | The Pearson correlation between self and self.shift(lag). | 
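To make "the Pearson correlation between self and self.shift(lag)" concrete, here is a plain-Python sketch. The functions pearson and autocorr are illustrative helpers written for this example, not the bigframes implementation, and they assume the input has no missing values:

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; NaN if undefined."""
    n = len(xs)
    if n < 2:
        return math.nan
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return math.nan  # a constant side makes the correlation undefined
    return cov / math.sqrt(var_x * var_y)


def autocorr(values, lag=1):
    """Correlate each element with the element `lag` positions earlier."""
    return pearson(values[lag:], values[: len(values) - lag])


s = [0.25, 0.5, 0.2, -0.05]
print(autocorr(s))                # about 0.1036
print(autocorr(s, lag=2))         # about -1.0
print(autocorr([1, 0, 0, 0]))     # nan
```

Shifting by lag drops the first lag rows from one side and the last lag rows from the other, so only the overlapping pairs enter the correlation.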
bar
bar(
    x: typing.Optional[typing.Hashable] = None,
    y: typing.Optional[typing.Hashable] = None,
    **kwargs
)
Draw a vertical bar plot.
This function calls pandas.plot to generate a plot with a random sample
of items. For consistent results, the random sampling is reproducible.
Use the sampling_random_state parameter to modify the sampling seed.
Examples:
Basic plot.
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({'lab':['A', 'B', 'C'], 'val':[10, 30, 20]})
>>> ax = df.plot.bar(x='lab', y='val', rot=0)
Plot a whole dataframe to a bar plot. Each column is assigned a distinct color, and each row is nested in a group along the horizontal axis.
>>> speed = [0.1, 17.5, 40, 48, 52, 69, 88]
>>> lifespan = [2, 8, 70, 1.5, 25, 12, 28]
>>> index = ['snail', 'pig', 'elephant',
...          'rabbit', 'giraffe', 'coyote', 'horse']
>>> df = bpd.DataFrame({'speed': speed, 'lifespan': lifespan}, index=index)
>>> ax = df.plot.bar(rot=0)
Plot stacked bar charts for the DataFrame.
>>> ax = df.plot.bar(stacked=True)
If you don’t like the default colors, you can specify how you’d like each column to be colored.
>>> axes = df.plot.bar(
...     rot=0, subplots=True, color={"speed": "red", "lifespan": "green"}
... )
| Parameters | |
|---|---|
| Name | Description | 
| x | label or position, optional. Allows plotting of one column versus another. If not specified, the index of the DataFrame is used. |
| y | label or position, optional. Allows plotting of one column versus another. If not specified, all numerical columns are used. |
| Returns | |
|---|---|
| Type | Description | 
| matplotlib.axes.Axes or numpy.ndarray | Bar plot, or array of bar plots if subplots is True. |
between
between(left, right, inclusive="both")
Return boolean Series equivalent to left <= series <= right.
This function returns a boolean vector containing True wherever the
corresponding Series element is between the boundary values left and
right. NA values are treated as False.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
Boundary values are included by default:
>>> s = bpd.Series([2, 0, 4, 8, np.nan])
>>> s.between(1, 4)
0     True
1    False
2     True
3    False
4     <NA>
dtype: boolean
With inclusive set to "neither", boundary values are excluded:
>>> s.between(1, 4, inclusive="neither")
0     True
1    False
2    False
3    False
4     <NA>
dtype: boolean
left and right can be any scalar value:
>>> s = bpd.Series(['Alice', 'Bob', 'Carol', 'Eve'])
>>> s.between('Anna', 'Daniel')
0    False
1     True
2     True
3    False
dtype: boolean
| Parameters | |
|---|---|
| Name | Description | 
| left | scalar or list-like. Left boundary. |
| right | scalar or list-like. Right boundary. |
| inclusive | {"both", "neither", "left", "right"}. Include boundaries. Whether to set each bound as closed or open. |
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Series representing whether each element is between left and right (inclusive). | 
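The four inclusive modes differ only in whether each bound uses <= or <. The sketch below is a plain-Python illustration with a hypothetical between function; it assumes scalar bounds and mirrors the example output above by leaving a missing element as missing (None here):

```python
import math
import operator


def between(values, left, right, inclusive="both"):
    """Element-wise bound check; missing inputs stay missing (None)."""
    if inclusive not in {"both", "neither", "left", "right"}:
        raise ValueError(f"unknown inclusive mode: {inclusive!r}")
    lower_ok = operator.le if inclusive in ("both", "left") else operator.lt
    upper_ok = operator.le if inclusive in ("both", "right") else operator.lt
    result = []
    for v in values:
        if v is None or (isinstance(v, float) and math.isnan(v)):
            result.append(None)
        else:
            result.append(lower_ok(left, v) and upper_ok(v, right))
    return result


s = [2, 0, 4, 8, float("nan")]
print(between(s, 1, 4))                        # [True, False, True, False, None]
print(between(s, 1, 4, inclusive="neither"))   # [True, False, False, False, None]
print(between(["Alice", "Bob", "Carol", "Eve"], "Anna", "Daniel"))
# [False, True, True, False]
```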
bfill
bfill(*, limit: typing.Optional[int] = None) -> bigframes.series.Series
Fill NA/NaN values by using the next valid observation to fill the gap.
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame or bigframes.pandas.Series or None | Object with missing values filled. | 
cache
cache()
Materializes the Series to a temporary table.
Useful if the Series will be used multiple times, as this avoids recomputing the shared intermediate value.
| Returns | |
|---|---|
| Type | Description | 
| Series | Self | 
case_when
case_when(caselist) -> bigframes.series.Series
Replace values where the conditions are True.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> c = bpd.Series([6, 7, 8, 9], name="c")
>>> a = bpd.Series([0, 0, 1, 2])
>>> b = bpd.Series([0, 3, 4, 5])
>>> c.case_when(
...     caselist=[
...         (a.gt(0), a),  # condition, replacement
...         (b.gt(0), b),
...     ]
... )
0    6
1    3
2    1
3    2
Name: c, dtype: Int64
See also:
- bigframes.pandas.Series.mask: Replace values where the condition is True.
| Parameter | |
|---|---|
| Name | Description | 
| caselist | A list of tuples of conditions and expected replacements. Takes the form:  |
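In the example above, row 2 satisfies both conditions and takes its value from a, so the earliest matching (condition, replacement) pair wins, and rows matching no condition keep their original value. A plain-Python sketch of that selection logic, using a hypothetical case_when function over lists:

```python
def case_when(default, caselist):
    """For each row, use the replacement of the first condition that is True."""
    result = list(default)
    for i in range(len(result)):
        for condition, replacement in caselist:
            if condition[i]:
                result[i] = replacement[i]
                break  # earlier cases take precedence over later ones
    return result


c = [6, 7, 8, 9]
a = [0, 0, 1, 2]
b = [0, 3, 4, 5]
cases = [
    ([x > 0 for x in a], a),  # condition, replacement
    ([x > 0 for x in b], b),
]
print(case_when(c, cases))    # [6, 3, 1, 2]
```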
clip
clip(lower, upper)
Trim values at input threshold(s).
Assigns values outside boundary to boundary values. Thresholds can be singular values or array-like, and in the latter case the clipping is performed element-wise in the specified axis.
| Parameters | |
|---|---|
| Name | Description | 
| lower | float or array-like, default None. Minimum threshold value. All values below this threshold will be set to it. A missing threshold (e.g. NA) will not clip the value. |
| upper | float or array-like, default None. Maximum threshold value. All values above this threshold will be set to it. A missing threshold (e.g. NA) will not clip the value. |
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Series. | 
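As a rough illustration of the clipping rule, the plain-Python sketch below handles scalar thresholds only and treats None as a missing threshold that does not clip. The clip function here is a hypothetical stand-in, not the bigframes method:

```python
def clip(values, lower=None, upper=None):
    """Pull values outside [lower, upper] back to the nearest bound."""
    result = []
    for v in values:
        if lower is not None and v < lower:
            v = lower
        if upper is not None and v > upper:
            v = upper
        result.append(v)
    return result


print(clip([-2, 0, 5, 11], lower=0, upper=10))   # [0, 0, 5, 10]
print(clip([-2, 5], upper=3))                    # [-2, 3]
```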
combine
combine(other, func) -> bigframes.series.Series
Combine the Series with a Series or scalar according to func.
Combine the Series and other using func to perform elementwise
selection for combined Series.
fill_value is assumed when value is missing at some index
from one of the two objects being combined.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
Consider two datasets, s1 and s2, containing the
highest clocked speeds of different birds.
>>> s1 = bpd.Series({'falcon': 330.0, 'eagle': 160.0})
>>> s1
falcon    330.0
eagle     160.0
dtype: Float64
>>> s2 = bpd.Series({'falcon': 345.0, 'eagle': 200.0, 'duck': 30.0})
>>> s2
falcon    345.0
eagle     200.0
duck       30.0
dtype: Float64
Now, combine the two datasets to view the highest speed of each bird across both:
>>> s1.combine(s2, np.maximum)
falcon    345.0
eagle     200.0
duck       <NA>
dtype: Float64
| Parameters | |
|---|---|
| Name | Description | 
| other | Series or scalar. The value(s) to be combined with the  |
| func | function. BigFrames DataFrames  |
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of combining the Series with the other object. | 
combine_first
combine_first(other: bigframes.series.Series) -> bigframes.series.Series
Update null elements with value in the same location in 'other'.
Combine two Series objects by filling null values in one Series with non-null values from the other Series. Result index will be the union of the two indexes.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> s1 = bpd.Series([1, np.nan])
>>> s2 = bpd.Series([3, 4, 5])
>>> s1.combine_first(s2)
0    1.0
1    4.0
2    5.0
dtype: Float64
Null values still persist if the location of that null value
does not exist in other:
>>> s1 = bpd.Series({'falcon': np.nan, 'eagle': 160.0})
>>> s2 = bpd.Series({'eagle': 200.0, 'duck': 30.0})
>>> s1.combine_first(s2)
falcon     <NA>
eagle     160.0
duck       30.0
dtype: Float64
| Parameter | |
|---|---|
| Name | Description | 
| other | Series. The value(s) to be used for filling null values. |
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of combining the provided Series with the other object. | 
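The union-of-indexes behavior can be sketched with ordinary dictionaries, where keys play the role of index labels and None stands in for a null value. The combine_first function below is a hypothetical illustration of the rule, not the library's implementation:

```python
def combine_first(first, other):
    """Keep values from `first`; fill its nulls and missing labels from `other`."""
    labels = list(first) + [label for label in other if label not in first]
    result = {}
    for label in labels:
        value = first.get(label)
        if value is None:
            value = other.get(label)  # stays None if `other` lacks the label too
        result[label] = value
    return result


s1 = {"falcon": None, "eagle": 160.0}
s2 = {"eagle": 200.0, "duck": 30.0}
print(combine_first(s1, s2))
# {'falcon': None, 'eagle': 160.0, 'duck': 30.0}
```

The falcon entry stays null because s2 has no falcon label to draw from.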
copy
copy() -> bigframes.series.Series
Make a copy of this object's indices and data.
A new object will be created with a copy of the calling object's data and indices. Modifications to the data or indices of the copy will not be reflected in the original object.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
Modification in the original Series will not affect the copy Series:
>>> s = bpd.Series([1, 2], index=["a", "b"])
>>> s
a    1
b    2
dtype: Int64
>>> s_copy = s.copy()
>>> s_copy
a    1
b    2
dtype: Int64
>>> s.loc['b'] = 22
>>> s
a     1
b    22
dtype: Int64
>>> s_copy
a    1
b    2
dtype: Int64
Modification in the original DataFrame will not affect the copy DataFrame:
>>> df = bpd.DataFrame({'a': [1, 3], 'b': [2, 4]})
>>> df
   a  b
0  1  2
1  3  4
<BLANKLINE>
[2 rows x 2 columns]
>>> df_copy = df.copy()
>>> df_copy
   a  b
0  1  2
1  3  4
<BLANKLINE>
[2 rows x 2 columns]
>>> df.loc[df["b"] == 2, "b"] = 22
>>> df
   a   b
0  1  22
1  3   4
<BLANKLINE>
[2 rows x 2 columns]
>>> df_copy
   a  b
0  1  2
1  3  4
<BLANKLINE>
[2 rows x 2 columns]
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame or bigframes.pandas.Series | Object type matches caller. | 
corr
corr(other: bigframes.series.Series, method="pearson", min_periods=None) -> float
Compute the correlation with the other Series. Non-number values are ignored in the computation.
Uses the "Pearson" method of correlation. Numbers are converted to float before calculation, so the result may be unstable.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s1 = bpd.Series([.2, .0, .6, .2])
>>> s2 = bpd.Series([.3, .6, .0, .1])
>>> s1.corr(s2)
np.float64(-0.8510644963469901)
>>> s1 = bpd.Series([1, 2, 3], index=[0, 1, 2])
>>> s2 = bpd.Series([1, 2, 3], index=[2, 1, 0])
>>> s1.corr(s2)
np.float64(-1.0)
| Parameters | |
|---|---|
| Name | Description | 
| other | Series. The series with which this is to be correlated. |
| method | string, default "pearson". Correlation method to use - currently only "pearson" is supported. |
| min_periods | int, default None. The minimum number of observations needed to return a result. Non-default values are not yet supported, so a result will be returned for at least two observations. |
| Returns | |
|---|---|
| Type | Description | 
| float | Will return NaN if there are fewer than two numeric pairs, either series has a variance or covariance of zero, or any input value is infinite. | 
count
count() -> int
Return number of non-NA/null observations in the Series.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([0.0, 1.0, bpd.NA])
>>> s
0     0.0
1     1.0
2    <NA>
dtype: Float64
>>> s.count()
np.int64(2)
| Returns | |
|---|---|
| Type | Description | 
| int or bigframes.pandas.Series (if level specified) | Number of non-null values in the Series. | 
cov
cov(other: bigframes.series.Series) -> float
Compute covariance with Series, excluding missing values.
The two Series objects are not required to be the same length and
will be aligned internally before the covariance is calculated.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s1 = bpd.Series([0.90010907, 0.13484424, 0.62036035])
>>> s2 = bpd.Series([0.12528585, 0.26962463, 0.51111198])
>>> s1.cov(s2)
np.float64(-0.01685762652715874)
| Parameter | |
|---|---|
| Name | Description | 
| other | Series. Series with which to compute the covariance. |
| Returns | |
|---|---|
| Type | Description | 
| float | Covariance between Series and other normalized by N-1 (unbiased estimator). | 
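The two ideas in this entry, alignment on shared labels and normalization by N-1, can be shown in a few lines of plain Python. The sketch represents each Series as a dict from label to value with None for missing data; the cov function is a hypothetical helper written for illustration:

```python
import math


def cov(s1, s2):
    """Sample covariance over labels present and non-missing in both inputs."""
    labels = [
        k for k in s1
        if k in s2 and s1[k] is not None and s2[k] is not None
    ]
    n = len(labels)
    if n < 2:
        return math.nan  # N-1 normalization needs at least two pairs
    mean1 = sum(s1[k] for k in labels) / n
    mean2 = sum(s2[k] for k in labels) / n
    total = sum((s1[k] - mean1) * (s2[k] - mean2) for k in labels)
    return total / (n - 1)


s1 = dict(enumerate([0.90010907, 0.13484424, 0.62036035]))
s2 = dict(enumerate([0.12528585, 0.26962463, 0.51111198]))
print(cov(s1, s2))   # about -0.016858
```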
cummax
cummax() -> bigframes.series.Series
Return cumulative maximum over a DataFrame or Series axis.
Returns a DataFrame or Series of the same size containing the cumulative maximum.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([2, np.nan, 5, -1, 0])
>>> s
0     2.0
1    <NA>
2     5.0
3    -1.0
4     0.0
dtype: Float64
By default, NA values are ignored.
>>> s.cummax()
0     2.0
1    <NA>
2     5.0
3     5.0
4     5.0
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Return cumulative maximum of scalar or Series. | 
cummin
cummin() -> bigframes.series.Series
Return cumulative minimum over a DataFrame or Series axis.
Returns a DataFrame or Series of the same size containing the cumulative minimum.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([2, np.nan, 5, -1, 0])
>>> s
0     2.0
1    <NA>
2     5.0
3    -1.0
4     0.0
dtype: Float64
By default, NA values are ignored.
>>> s.cummin()
0     2.0
1    <NA>
2     2.0
3    -1.0
4    -1.0
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Return cumulative minimum of scalar or Series. | 
cumprod
cumprod() -> bigframes.series.Series
Return cumulative product over a DataFrame or Series axis.
Returns a DataFrame or Series of the same size containing the cumulative product.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([2, np.nan, 5, -1, 0])
>>> s
0     2.0
1    <NA>
2     5.0
3    -1.0
4     0.0
dtype: Float64
By default, NA values are ignored.
>>> s.cumprod()
0     2.0
1    <NA>
2    10.0
3   -10.0
4     0.0
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Return cumulative product of scalar or Series. |
cumsum
cumsum() -> bigframes.series.Series
Return cumulative sum over a DataFrame or Series axis.
Returns a DataFrame or Series of the same size containing the cumulative sum.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([2, np.nan, 5, -1, 0])
>>> s
0     2.0
1    <NA>
2     5.0
3    -1.0
4     0.0
dtype: Float64
By default, NA values are ignored.
>>> s.cumsum()
0     2.0
1    <NA>
2     7.0
3     6.0
4     6.0
dtype: Float64
| Parameter | |
|---|---|
| Name | Description | 
| axis | {0 or 'index', 1 or 'columns'}, default 0. The index or the name of the axis. 0 is equivalent to None or 'index'. For  |
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Return cumulative sum of scalar or Series. | 
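cummax, cummin, cumprod and cumsum share one pattern in their examples: a missing value yields a missing result at that position but does not reset the running value. A plain-Python sketch of that pattern, with a hypothetical cumulative helper and None standing in for NA:

```python
import operator


def cumulative(values, op):
    """Running reduction that skips missing values without resetting."""
    running = None
    result = []
    for v in values:
        if v is None:
            result.append(None)  # output is missing here; running value is kept
            continue
        running = v if running is None else op(running, v)
        result.append(running)
    return result


data = [2, None, 5, -1, 0]
print(cumulative(data, operator.add))   # [2, None, 7, 6, 6]
print(cumulative(data, operator.mul))   # [2, None, 10, -10, 0]
print(cumulative(data, max))            # [2, None, 5, 5, 5]
print(cumulative(data, min))            # [2, None, 2, -1, -1]
```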
diff
diff(periods: int = 1) -> bigframes.series.Series
First discrete difference of element.
Calculates the difference of a Series element compared with another element in the Series (default is element in previous row).
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
Difference with previous row
>>> s = bpd.Series([1, 1, 2, 3, 5, 8])
>>> s.diff()
0    <NA>
1       0
2       1
3       1
4       2
5       3
dtype: Int64
Difference with 3rd previous row
>>> s.diff(periods=3)
0    <NA>
1    <NA>
2    <NA>
3       2
4       4
5       6
dtype: Int64
Difference with following row
>>> s.diff(periods=-1)
0       0
1      -1
2      -1
3      -2
4      -3
5    <NA>
dtype: Int64
| Parameter | |
|---|---|
| Name | Description | 
| periods | int, default 1. Periods to shift for calculating difference, accepts negative values. |
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | First differences of the Series. | 
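The effect of positive and negative periods is easiest to see as an index offset: each row is compared with the row periods positions before it, and rows without such a partner become missing. A plain-Python sketch with a hypothetical diff function, using None for NA:

```python
def diff(values, periods=1):
    """values[i] - values[i - periods], or None when no partner row exists."""
    n = len(values)
    result = []
    for i in range(n):
        j = i - periods
        if 0 <= j < n and values[i] is not None and values[j] is not None:
            result.append(values[i] - values[j])
        else:
            result.append(None)
    return result


s = [1, 1, 2, 3, 5, 8]
print(diff(s))               # [None, 0, 1, 1, 2, 3]
print(diff(s, periods=3))    # [None, None, None, 2, 4, 6]
print(diff(s, periods=-1))   # [0, -1, -1, -2, -3, None]
```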
div
div(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Return floating division of Series and other, element-wise (binary operator truediv).
Equivalent to series / other, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.divide(b)
a     1.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
divide
divide(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Return floating division of Series and other, element-wise (binary operator truediv).
Equivalent to series / other, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.divide(b)
a     1.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
divmod
divmod(other) -> typing.Tuple[bigframes.series.Series, bigframes.series.Series]
Return integer division and modulo of Series and other, element-wise (binary operator divmod).
Equivalent to divmod(series, other).
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.divmod(b)
(a     1.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64,
a     0.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64)
| Returns | |
|---|---|
| Type | Description | 
| Tuple[bigframes.pandas.Series, bigframes.pandas.Series] | The result of the operation. The result is always consistent with (floordiv, mod) (though pandas may not). | 
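The example output follows from two rules: operands are aligned on the union of their labels, and a missing value on either side gives a missing result in both the quotient and the remainder. The sketch below shows this with dicts and Python's built-in divmod; series_divmod is a hypothetical helper, and it does not handle division by zero:

```python
def series_divmod(a, b):
    """Label-aligned (floordiv, mod); missing operands propagate as None."""
    labels = list(a) + [label for label in b if label not in a]
    quotients, remainders = {}, {}
    for label in labels:
        x, y = a.get(label), b.get(label)
        if x is None or y is None:
            quotients[label] = remainders[label] = None
        else:
            quotients[label], remainders[label] = divmod(x, y)
    return quotients, remainders


a = {"a": 1.0, "b": 1.0, "c": 1.0, "d": None}
b = {"a": 1.0, "b": None, "d": 1.0, "e": None}
q, r = series_divmod(a, b)
print(q)   # {'a': 1.0, 'b': None, 'c': None, 'd': None, 'e': None}
print(r)   # {'a': 0.0, 'b': None, 'c': None, 'd': None, 'e': None}
```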
dot
dot(other)
Compute the dot product between the Series and the columns of other.
This method computes the dot product between the Series and another Series, or between the Series and each column of a DataFrame or of an array.
It can also be called using self @ other in Python >= 3.5.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([0, 1, 2, 3])
>>> other = bpd.Series([-1, 2, -3, 4])
>>> s.dot(other)
np.int64(8)
You can also use the operator @ for the dot product:
>>> s @ other
np.int64(8)
| Parameter | |
|---|---|
| Name | Description | 
| other | Series. The other object to compute the dot product with its columns. |
| Returns | |
|---|---|
| Type | Description | 
| scalar, bigframes.pandas.Series or numpy.ndarray | The dot product of the Series and other if other is a Series; a Series holding the dot product of the Series with each column of other if other is a DataFrame; or a numpy.ndarray holding the dot product of the Series with each column of the array if other is a NumPy array. |
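For readers who want the arithmetic spelled out, the dot product is the sum of element-wise products, and the DataFrame case repeats that once per column. The plain-Python sketch below uses hypothetical helpers dot and dot_columns and represents a table as a dict of column lists:

```python
def dot(s, other):
    """Sum of element-wise products of two equal-length sequences."""
    if len(s) != len(other):
        raise ValueError("operands must have the same length")
    return sum(x * y for x, y in zip(s, other))


def dot_columns(s, columns):
    """Dot product of `s` with each column of a table given as {name: values}."""
    return {name: dot(s, values) for name, values in columns.items()}


s = [0, 1, 2, 3]
other = [-1, 2, -3, 4]
print(dot(s, other))   # 8
print(dot_columns(s, {"p": [1, 1, 1, 1], "q": other}))   # {'p': 6, 'q': 8}
```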
drop
drop(
    labels: typing.Any = None,
    *,
    axis: typing.Union[int, str] = 0,
    index: typing.Any = None,
    columns: typing.Union[typing.Hashable, typing.Iterable[typing.Hashable]] = None,
    level: typing.Optional[typing.Union[str, int]] = None
) -> bigframes.series.Series
Return Series with specified index labels removed.
Remove elements of a Series based on specifying the index labels. When using a multi-index, labels on different levels can be removed by specifying the level.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series(data=np.arange(3), index=['A', 'B', 'C'])
>>> s
A    0
B    1
C    2
dtype: Int64
Drop labels B and C:
>>> s.drop(labels=['B', 'C'])
A    0
dtype: Int64
Drop 2nd level label in MultiIndex Series:
>>> import pandas as pd
>>> midx = pd.MultiIndex(levels=[['llama', 'cow', 'falcon'],
...                              ['speed', 'weight', 'length']],
...                      codes=[[0, 0, 0, 1, 1, 1, 2, 2, 2],
...                             [0, 1, 2, 0, 1, 2, 0, 1, 2]])
>>> s = bpd.Series([45, 200, 1.2, 30, 250, 1.5, 320, 1, 0.3],
...               index=midx)
>>> s
llama   speed      45.0
        weight    200.0
        length      1.2
cow     speed      30.0
        weight    250.0
        length      1.5
falcon  speed     320.0
        weight      1.0
        length      0.3
dtype: Float64
>>> s.drop(labels='weight', level=1)
llama   speed      45.0
        length      1.2
cow     speed      30.0
        length      1.5
falcon  speed     320.0
        length      0.3
dtype: Float64
| Parameter | |
|---|---|
| Name | Description | 
| labels | single label or list-like. Index labels to drop. |
| Exceptions | |
|---|---|
| Type | Description | 
| KeyError | If none of the labels are found in the index. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series or None | Series with specified index labels removed or None if inplace=True. | 
drop_duplicates
drop_duplicates(*, keep: str = "first") -> bigframes.series.Series
Return Series with duplicate values removed.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
Generate a Series with duplicated entries.
>>> s = bpd.Series(['llama', 'cow', 'llama', 'beetle', 'llama', 'hippo'],
...                name='animal')
>>> s
0     llama
1       cow
2     llama
3    beetle
4     llama
5     hippo
Name: animal, dtype: string
With the 'keep' parameter, the selection behaviour of duplicated values can be changed. The value 'first' keeps the first occurrence for each set of duplicated entries. The default value of keep is 'first'.
>>> s.drop_duplicates()
0     llama
1       cow
3    beetle
5     hippo
Name: animal, dtype: string
The value 'last' for parameter 'keep' keeps the last occurrence for each set of duplicated entries.
>>> s.drop_duplicates(keep='last')
1       cow
3    beetle
4     llama
5     hippo
Name: animal, dtype: string
The value False for parameter 'keep' discards all sets of duplicated entries.
>>> s.drop_duplicates(keep=False)
1       cow
3    beetle
5     hippo
Name: animal, dtype: string
| Parameter | |
|---|---|
| Name | Description | 
| keep | {'first', 'last', False}, default 'first'. Method to handle dropping duplicates: 'first' : Drop duplicates except for the first occurrence. 'last' : Drop duplicates except for the last occurrence. False : Drop all duplicates. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Series with duplicates dropped. | 
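The three `keep` modes can be sketched in plain Python. This is only a model of the documented behaviour, not how bigframes executes the operation; `drop_duplicates` here is a hypothetical stand-alone function that returns the surviving (position, value) pairs.

```python
from collections import Counter


def drop_duplicates(values, keep="first"):
    """Return the (position, value) pairs that survive de-duplication."""
    if keep not in ("first", "last", False):
        raise ValueError("keep must be 'first', 'last' or False")
    pairs = list(enumerate(values))
    if keep is False:
        # Discard every value that occurs more than once.
        counts = Counter(values)
        return [(i, v) for i, v in pairs if counts[v] == 1]
    seen = set()
    kept = []
    # For 'last', scan from the end so the final occurrence is met first.
    for i, v in (reversed(pairs) if keep == "last" else pairs):
        if v not in seen:
            seen.add(v)
            kept.append((i, v))
    return sorted(kept)  # restore the original positional order


animals = ["llama", "cow", "llama", "beetle", "llama", "hippo"]
for mode in ("first", "last", False):
    print(mode, drop_duplicates(animals, keep=mode))
```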
droplevel
droplevel(
    level: typing.Union[str, int, typing.Sequence[typing.Union[str, int]]],
    axis: int | str = 0,
)
Return Series with requested index / column level(s) removed.
| Parameters | |
|---|---|
| Name | Description | 
| level | int, str, or list-like. If a string is given, it must be the name of a level. If list-like, elements must be names or positional indexes of levels. | 
| axis | {0 or 'index', 1 or 'columns'}, default 0For  | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Series with requested index / column level(s) removed. | 
dropna
dropna(
    *,
    axis: int = 0,
    inplace: bool = False,
    how: typing.Optional[str] = None,
    ignore_index: bool = False
) -> bigframes.series.Series
Return a new Series with missing values removed.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
Drop NA values from a Series:
>>> ser = bpd.Series([1., 2., np.nan])
>>> ser
0     1.0
1     2.0
2    <NA>
dtype: Float64
>>> ser.dropna()
0    1.0
1    2.0
dtype: Float64
Empty strings are not considered NA values. None is considered an NA value.
>>> ser = bpd.Series(['2', bpd.NA, '', None, 'I stay'], dtype='object')
>>> ser
0         2
1      <NA>
2
3      <NA>
4    I stay
dtype: string
>>> ser.dropna()
0         2
2
4    I stay
dtype: string
| Parameters | |
|---|---|
| Name | Description | 
| axis | 0 or 'index'. Unused. Parameter needed for compatibility with DataFrame. | 
| inplace | bool, default False. Unsupported, do not set. | 
| how | str, optional. Not in use. Kept for compatibility. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Series with NA entries dropped from it. | 
duplicated
duplicated(keep: str = "first") -> bigframes.series.Series
Indicate duplicate Series values.
Duplicated values are indicated as True values in the resulting
Series. Either all duplicates, all except the first or all except the
last occurrence of duplicates can be indicated.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
By default, for each set of duplicated values, the first occurrence is marked False and all others True:
>>> animals = bpd.Series(['llama', 'cow', 'llama', 'beetle', 'llama'])
>>> animals.duplicated()
0    False
1    False
2     True
3    False
4     True
dtype: boolean
which is equivalent to
>>> animals.duplicated(keep='first')
0    False
1    False
2     True
3    False
4     True
dtype: boolean
By using 'last', the last occurrence of each set of duplicated values is marked False and all others True:
>>> animals.duplicated(keep='last')
0     True
1    False
2     True
3    False
4    False
dtype: boolean
By setting keep to False, all duplicates are marked True:
>>> animals.duplicated(keep=False)
0     True
1    False
2     True
3    False
4     True
dtype: boolean
| Parameter | |
|---|---|
| Name | Description | 
| keep | {'first', 'last', False}, default 'first'Method to handle dropping duplicates: 'first' : Mark duplicates as  | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Series indicating whether each value has occurred in the preceding values. | 
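The marking rules can be modelled with a short plain-Python function. It is a sketch of the semantics shown in the examples, under the assumption that values are hashable; the name `duplicated` is reused here for a hypothetical list-based helper, not the bigframes method.

```python
from collections import Counter


def duplicated(values, keep="first"):
    """Return a list of booleans marking duplicate entries."""
    if keep is False:
        # Every member of a duplicated set is marked True.
        counts = Counter(values)
        return [counts[v] > 1 for v in values]
    n = len(values)
    # For 'last', walk backwards so the final occurrence is seen first.
    order = range(n - 1, -1, -1) if keep == "last" else range(n)
    seen = set()
    flags = [False] * n
    for i in order:
        flags[i] = values[i] in seen
        seen.add(values[i])
    return flags


animals = ["llama", "cow", "llama", "beetle", "llama"]
print(duplicated(animals))
print(duplicated(animals, keep="last"))
print(duplicated(animals, keep=False))
```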
eq
eq(other: object) -> bigframes.series.Series
Return 'equal to' of Series and other, element-wise (binary operator eq).
Equivalent to series == other, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.eq(b)
a    True
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: boolean
| Returns | |
|---|---|
| Type | Description | 
| Series | The result of the operation. | 
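The example output follows from two rules: the operands are aligned on the union of their index labels, and any comparison involving a missing value yields `<NA>`. The sketch below models this with dictionaries, using `None` as a stand-in for `<NA>`. The helper `aligned_eq` and the sorted label order are assumptions chosen to reproduce the example, not a description of the bigframes implementation.

```python
NA = None  # stand-in for <NA>


def aligned_eq(left, right):
    """Compare two label -> value dicts element-wise after aligning labels."""
    result = {}
    for label in sorted(set(left) | set(right)):
        a = left.get(label, NA)
        b = right.get(label, NA)
        # A label missing from either side, or holding NA, yields NA.
        result[label] = NA if a is NA or b is NA else a == b
    return result


a = {"a": 1.0, "b": 1.0, "c": 1.0, "d": NA}
b = {"a": 1.0, "b": NA, "d": 1.0, "e": NA}
print(aligned_eq(a, b))
```

Only label `a` has a non-missing value on both sides, so it is the only entry with a definite result.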
equals
equals(
    other: typing.Union[bigframes.series.Series, bigframes.dataframe.DataFrame]
) -> bool
Test whether two objects contain the same elements.
This function allows two Series or DataFrames to be compared against each other to see if they have the same shape and elements. NaNs in the same location are considered equal.
The row/column index do not need to have the same type, as long as the values are considered equal. Corresponding columns must be of the same dtype.
| Parameter | |
|---|---|
| Name | Description | 
| other | Series or DataFrame. The other Series or DataFrame to be compared with the first. | 
| Returns | |
|---|---|
| Type | Description | 
| bool | True if all elements are the same in both objects, False otherwise. | 
expanding
expanding(min_periods: int = 1) -> bigframes.core.window.Window
Provide expanding window calculations.
| Parameter | |
|---|---|
| Name | Description | 
| min_periods | int, default 1Minimum number of observations in window required to have a value; otherwise, result is  | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.core.window.Window | Expanding subclass. | 
explode
explode(*, ignore_index: typing.Optional[bool] = False) -> bigframes.series.Series
Transform each element of a list-like to a row.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([[1, 2, 3], [], [3, 4]])
>>> s
0    [1 2 3]
1         []
2      [3 4]
dtype: list<item: int64>[pyarrow]
>>> s.explode()
0       1
0       2
0       3
1    <NA>
2       3
2       4
dtype: Int64
| Parameter | |
|---|---|
| Name | Description | 
| ignore_index | bool, default False. If True, the resulting index will be labeled 0, 1, …, n - 1. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Exploded lists to rows; index will be duplicated for these rows. | 
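A plain-Python sketch of the row expansion, including the empty-list case and `ignore_index`, might look like the following. The list-of-pairs representation and the `explode` helper are hypothetical; `None` stands in for `<NA>`.

```python
def explode(pairs, ignore_index=False):
    """Expand each (label, list) pair into one (label, item) pair per item."""
    out = []
    for label, items in pairs:
        if len(items) == 0:
            # An empty list still produces one row, holding a missing value.
            out.append((label, None))
        else:
            out.extend((label, item) for item in items)
    if ignore_index:
        # Relabel the rows 0, 1, ..., n - 1.
        out = [(i, value) for i, (_, value) in enumerate(out)]
    return out


s = [(0, [1, 2, 3]), (1, []), (2, [3, 4])]
print(explode(s))
print(explode(s, ignore_index=True))
```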
ffill
ffill(*, limit: typing.Optional[int] = None) -> bigframes.series.Series
Fill NA/NaN values by propagating the last valid observation to next valid.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame([[np.nan, 2, np.nan, 0],
...                     [3, 4, np.nan, 1],
...                     [np.nan, np.nan, np.nan, np.nan],
...                     [np.nan, 3, np.nan, 4]],
...                    columns=list("ABCD")).astype("Float64")
>>> df
      A     B     C     D
0  <NA>   2.0  <NA>   0.0
1   3.0   4.0  <NA>   1.0
2  <NA>  <NA>  <NA>  <NA>
3  <NA>   3.0  <NA>   4.0
<BLANKLINE>
[4 rows x 4 columns]
Fill NA/NaN values in DataFrames:
>>> df.ffill()
      A    B     C    D
0  <NA>  2.0  <NA>  0.0
1   3.0  4.0  <NA>  1.0
2   3.0  4.0  <NA>  1.0
3   3.0  3.0  <NA>  4.0
<BLANKLINE>
[4 rows x 4 columns]
Fill NA/NaN values in Series:
>>> series = bpd.Series([1, np.nan, 2, 3])
>>> series.ffill()
0    1.0
1    1.0
2    2.0
3    3.0
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame or bigframes.pandas.Series or None | Object with missing values filled. | 
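Forward filling can be sketched as a single pass that remembers the last valid value. The `limit` handling below assumes the pandas meaning (the maximum number of consecutive missing values to fill), which this page does not spell out; `ffill` here is a hypothetical list-based helper with `None` standing in for `<NA>`.

```python
def ffill(values, limit=None):
    """Propagate the last valid value forward over missing entries."""
    out = []
    last = None
    run = 0  # consecutive missing values filled since the last valid one
    for v in values:
        if v is not None:
            last, run = v, 0
            out.append(v)
        elif last is not None and (limit is None or run < limit):
            run += 1
            out.append(last)
        else:
            # Nothing to propagate yet, or the limit has been reached.
            out.append(None)
    return out


print(ffill([1.0, None, 2.0, 3.0]))
print(ffill([None, 3.0, None, None]))
print(ffill([1, None, None, None, 2], limit=2))
```

Leading missing values stay missing because there is no earlier observation to propagate, which matches column A in the DataFrame example.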
fillna
fillna(value=None) -> bigframes.series.Series
Fill NA/NaN values using the specified method.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([np.nan, 2, np.nan, -1])
>>> s
0    <NA>
1     2.0
2    <NA>
3    -1.0
dtype: Float64
Replace all NA elements with 0s.
>>> s.fillna(0)
0    0.0
1    2.0
2    0.0
3   -1.0
dtype: Float64
You can use fill values from another Series:
>>> s_fill = bpd.Series([11, 22, 33])
>>> s.fillna(s_fill)
0    11.0
1     2.0
2    33.0
3    -1.0
dtype: Float64
| Parameter | |
|---|---|
| Name | Description | 
| value | scalar, dict, Series, or DataFrame, default None. Value to use to fill holes (e.g. 0). | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series or None | Object with missing values filled or None. | 
filter
filter(
    items: typing.Optional[typing.Iterable] = None,
    like: typing.Optional[str] = None,
    regex: typing.Optional[str] = None,
    axis: typing.Optional[typing.Union[str, int]] = None,
) -> bigframes.series.Series
Subset the dataframe rows or columns according to the specified index labels.
Note that this routine does not filter a dataframe on its contents. The filter is applied to the labels of the index.
| Parameters | |
|---|---|
| Name | Description | 
| items | list-like. Keep labels from axis which are in items. | 
| like | str. Keep labels from axis for which "like in label == True". | 
| regex | str (regular expression). Keep labels from axis for which re.search(regex, label) == True. | 
| axis | {0 or 'index', 1 or 'columns', None}, default NoneThe axis to filter on, expressed either as an index (int) or axis name (str). By default this is the info axis, 'columns' for DataFrame. For  | 
| Exceptions | |
|---|---|
| Type | Description | 
| ValueError | If value provided is not exactly one of items, like, or regex. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame or bigframes.pandas.Series | Same type as input object. | 
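Since this method has no example on this page, here is a small plain-Python sketch of the three label-matching modes and the exactly-one-argument rule. It works on a list of (label, value) pairs; `filter_labels` and the sample labels are hypothetical, and the sketch ignores `axis`.

```python
import re


def filter_labels(pairs, items=None, like=None, regex=None):
    """Keep the (label, value) pairs whose label matches the given rule."""
    given = [arg for arg in (items, like, regex) if arg is not None]
    if len(given) != 1:
        raise ValueError("pass exactly one of items, like, or regex")
    if items is not None:
        wanted = set(items)
        keep = lambda label: label in wanted
    elif like is not None:
        keep = lambda label: like in str(label)  # substring test
    else:
        pattern = re.compile(regex)
        keep = lambda label: pattern.search(str(label)) is not None
    # Only labels are inspected; values are never looked at.
    return [(label, value) for label, value in pairs if keep(label)]


s = [("mouse", 1), ("rabbit", 2), ("bison", 3)]
print(filter_labels(s, items=["mouse", "bison"]))
print(filter_labels(s, like="bbi"))
print(filter_labels(s, regex="e$"))
```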
floordiv
floordiv(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Return integer division of Series and other, element-wise (binary operator floordiv).
Equivalent to series // other, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.floordiv(b)
a     1.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
ge
ge(other) -> bigframes.series.Series
Get 'greater than or equal to' of Series and other, element-wise (binary operator ge).
Equivalent to series >= other, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.ge(b)
a    True
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: boolean
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
groupby
groupby(
    by: typing.Union[
        typing.Hashable,
        bigframes.series.Series,
        typing.Sequence[typing.Union[typing.Hashable, bigframes.series.Series]],
    ] = None,
    axis=0,
    level: typing.Optional[
        typing.Union[int, str, typing.Sequence[int], typing.Sequence[str]]
    ] = None,
    as_index: bool = True,
    *,
    dropna: bool = True
) -> bigframes.core.groupby.SeriesGroupBy
Group Series using a mapper or by a Series of columns.
A groupby operation involves some combination of splitting the object, applying a function, and combining the results. This can be used to group large amounts of data and compute operations on these groups.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
You can group by a named index level.
>>> s = bpd.Series([380, 370., 24., 26.],
...                index=["Falcon", "Falcon", "Parrot", "Parrot"],
...                name="Max Speed")
>>> s.index.name="Animal"
>>> s
Animal
Falcon    380.0
Falcon    370.0
Parrot     24.0
Parrot     26.0
Name: Max Speed, dtype: Float64
>>> s.groupby("Animal").mean()
Animal
Falcon    375.0
Parrot     25.0
Name: Max Speed, dtype: Float64
You can also group by more than one index level.
>>> import pandas as pd
>>> s = bpd.Series([380, 370., 24., 26.],
...                index=pd.MultiIndex.from_tuples(
...                    [("Falcon", "Clear"),
...                     ("Falcon", "Cloudy"),
...                     ("Parrot", "Clear"),
...                     ("Parrot", "Clear")],
...                    names=["Animal", "Sky"]),
...                name="Max Speed")
>>> s
Animal    Sky
Falcon  Clear     380.0
        Cloudy    370.0
Parrot  Clear      24.0
        Clear      26.0
Name: Max Speed, dtype: Float64
>>> s.groupby("Animal").mean()
Animal
Falcon    375.0
Parrot     25.0
Name: Max Speed, dtype: Float64
>>> s.groupby("Sky").mean()
Sky
Clear     143.333333
Cloudy         370.0
Name: Max Speed, dtype: Float64
>>> s.groupby(["Animal", "Sky"]).mean()
Animal  Sky
Falcon  Clear     380.0
        Cloudy    370.0
Parrot  Clear      25.0
Name: Max Speed, dtype: Float64
You can also group by values in a Series provided the index matches with the original series.
>>> df = bpd.DataFrame({'Animal': ['Falcon', 'Falcon', 'Parrot', 'Parrot'],
...                     'Max Speed': [380., 370., 24., 26.],
...                     'Age': [10., 20., 4., 6.]})
>>> df
   Animal  Max Speed   Age
0  Falcon      380.0  10.0
1  Falcon      370.0  20.0
2  Parrot       24.0   4.0
3  Parrot       26.0   6.0
<BLANKLINE>
[4 rows x 3 columns]
>>> df['Max Speed'].groupby(df['Animal']).mean()
Animal
Falcon    375.0
Parrot     25.0
Name: Max Speed, dtype: Float64
>>> df['Age'].groupby(df['Animal']).max()
Animal
Falcon    20.0
Parrot     6.0
Name: Age, dtype: Float64
| Parameters | |
|---|---|
| Name | Description | 
| by | mapping, function, label, pd.Grouper or list of such, default NoneUsed to determine the groups for the groupby. If  | 
| axis | {0 or 'index', 1 or 'columns'}, default 0Split along rows (0) or columns (1). For  | 
| level | int, level name, or sequence of such, default NoneIf the axis is a MultiIndex (hierarchical), group by a particular level or levels. Do not specify both  | 
| as_index | bool, default TrueReturn object with group labels as the index. Only relevant for DataFrame input. as_index=False is effectively "SQL-style" grouped output. This argument has no effect on filtrations (see the "filtrations in the user guide"  | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.core.groupby.SeriesGroupBy | Returns a groupby object that contains information about the groups. | 
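The split, apply, combine pattern described above can be sketched in plain Python with a dictionary of lists. `group_mean` is a hypothetical helper that only covers the mean; it reuses the data from the examples, and the sorted key order is an assumption that matches the example output.

```python
from collections import defaultdict


def group_mean(keys, values):
    """Group values by key and return the mean of each group."""
    groups = defaultdict(list)
    for key, value in zip(keys, values):  # split
        groups[key].append(value)
    return {
        key: sum(vals) / len(vals)  # apply, then combine into one mapping
        for key, vals in sorted(groups.items())
    }


speeds = [380.0, 370.0, 24.0, 26.0]
animal = ["Falcon", "Falcon", "Parrot", "Parrot"]
sky = ["Clear", "Cloudy", "Clear", "Clear"]

print(group_mean(animal, speeds))
print(group_mean(sky, speeds))
# Grouping by two levels amounts to using a tuple of both labels as the key.
print(group_mean(list(zip(animal, sky)), speeds))
```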
gt
gt(other) -> bigframes.series.Series
Return 'greater than' of Series and other, element-wise (binary operator gt).
Equivalent to series > other, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.gt(b)
a    False
b     <NA>
c     <NA>
d     <NA>
e     <NA>
dtype: boolean
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
head
head(n: int = 5) -> bigframes.series.Series
Return the first n rows.
This function returns the first n rows for the object based
on position. It is useful for quickly testing if your object
has the right type of data in it.
For negative values of n, this function returns
all rows except the last |n| rows, equivalent to df[:n].
If n is larger than the number of rows, this function returns all rows.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({'animal': ['alligator', 'bee', 'falcon', 'lion',
...                     'monkey', 'parrot', 'shark', 'whale', 'zebra']})
>>> df
    animal
0  alligator
1        bee
2     falcon
3       lion
4     monkey
5     parrot
6      shark
7      whale
8      zebra
<BLANKLINE>
[9 rows x 1 columns]
Viewing the first 5 lines:
>>> df.head()
    animal
0  alligator
1        bee
2     falcon
3       lion
4     monkey
<BLANKLINE>
[5 rows x 1 columns]
Viewing the first n lines (three in this case):
>>> df.head(3)
    animal
0  alligator
1        bee
2     falcon
<BLANKLINE>
[3 rows x 1 columns]
For negative values of n:
>>> df.head(-3)
    animal
0  alligator
1        bee
2     falcon
3       lion
4     monkey
5     parrot
<BLANKLINE>
[6 rows x 1 columns]
| Parameter | |
|---|---|
| Name | Description | 
| n | int, default 5. Number of rows to select. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame or bigframes.pandas.Series | The first n rows of the caller object. | 
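The three cases for n (positive, negative, larger than the number of rows) behave like Python's `[:n]` slice on a list, which the short sketch below demonstrates. The `head` function here is a hypothetical list-based stand-in.

```python
def head(rows, n=5):
    """Return the first n rows, using the same rules as the slice rows[:n]."""
    return rows[:n]


animals = ["alligator", "bee", "falcon", "lion", "monkey",
           "parrot", "shark", "whale", "zebra"]

print(head(animals))       # default: first five rows
print(head(animals, -3))   # everything except the last three rows
print(head(animals, 20))   # n exceeds the length: all rows
```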
hist
hist(by: typing.Optional[typing.Sequence[str]] = None, bins: int = 10, **kwargs)
Draw one histogram of the DataFrame’s columns.
A histogram is a representation of the distribution of data.
This function groups the values of all given Series in the DataFrame
into bins and draws all bins in one matplotlib.axes.Axes.
This is useful when the DataFrame's Series are in a similar scale.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> import numpy as np
>>> df = bpd.DataFrame(np.random.randint(1, 7, 6000), columns=['one'])
>>> df['two'] = np.random.randint(1, 7, 6000) + np.random.randint(1, 7, 6000)
>>> ax = df.plot.hist(bins=12, alpha=0.5)
| Parameters | |
|---|---|
| Name | Description | 
| by | str or sequence, optionalColumn in the DataFrame to group by. It is not supported yet. | 
| bins | int, default 10Number of histogram bins to be used. | 
| Returns | |
|---|---|
| Type | Description | 
| matplotlib.AxesSubplot | A histogram plot. | 
idxmax
idxmax() -> typing.Hashable
Return the row label of the maximum value.
If multiple values equal the maximum, the first row label with that value is returned.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series(data=[1, None, 4, 3, 4],
...                index=['A', 'B', 'C', 'D', 'E'])
>>> s
A     1.0
B    <NA>
C     4.0
D     3.0
E     4.0
dtype: Float64
>>> s.idxmax()
'C'
| Returns | |
|---|---|
| Type | Description | 
| Index | Label of the maximum value. | 
idxmin
idxmin() -> typing.Hashable
Return the row label of the minimum value.
If multiple values equal the minimum, the first row label with that value is returned.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series(data=[1, None, 4, 1],
...                index=['A', 'B', 'C', 'D'])
>>> s
A     1.0
B    <NA>
C     4.0
D     1.0
dtype: Float64
>>> s.idxmin()
'A'
| Returns | |
|---|---|
| Type | Description | 
| Index | Label of the minimum value. | 
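The tie-breaking rule for idxmax and idxmin (the first matching label wins) comes from using a strict comparison while scanning in order. The sketch below models both methods on a list of (label, value) pairs and skips missing values, as the examples suggest. The helper names are hypothetical and `None` stands in for `<NA>`.

```python
def _idx_extreme(pairs, is_better):
    """Return the label of the first value that no later value strictly beats."""
    best_label, best_value = None, None
    for label, value in pairs:
        if value is None:
            continue  # missing values are skipped
        # Strict comparison: an equal later value never replaces the first one.
        if best_value is None or is_better(value, best_value):
            best_label, best_value = label, value
    return best_label


def idxmax(pairs):
    return _idx_extreme(pairs, lambda new, best: new > best)


def idxmin(pairs):
    return _idx_extreme(pairs, lambda new, best: new < best)


print(idxmax([("A", 1.0), ("B", None), ("C", 4.0), ("D", 3.0), ("E", 4.0)]))
print(idxmin([("A", 1.0), ("B", None), ("C", 4.0), ("D", 1.0)]))
```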
interpolate
interpolate(method: str = "linear") -> bigframes.series.Series
Fill NaN values using an interpolation method.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
Filling in NaN in a Series via linear interpolation.
>>> s = bpd.Series([0, 1, np.nan, 3])
>>> s
0     0.0
1     1.0
2    <NA>
3     3.0
dtype: Float64
>>> s.interpolate()
0    0.0
1    1.0
2    2.0
3    3.0
dtype: Float64
| Parameter | |
|---|---|
| Name | Description | 
| method | str, default 'linear'Interpolation technique to use. Only 'linear' supported. 'linear': Ignore the index and treat the values as equally spaced. This is the only method supported on MultiIndexes. 'index', 'values': use the actual numerical values of the index. 'pad': Fill in NaNs using existing values. 'nearest', 'zero', 'slinear': Emulates  | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Returns the same object type as the caller, interpolated at some or all NaN values. | 
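For the 'linear' method, which ignores the index and treats values as equally spaced, each gap between two known values is filled along a straight line by position. A minimal sketch follows, with `None` standing in for NaN and a hypothetical helper name. How missing values before the first or after the last known value are treated is deliberately not modelled here; they are left untouched.

```python
def interpolate_linear(values):
    """Fill interior gaps by straight-line interpolation over positions."""
    out = list(values)
    known = [i for i, v in enumerate(out) if v is not None]
    # Walk over consecutive pairs of known positions and fill what lies between.
    for left, right in zip(known, known[1:]):
        step = (out[right] - out[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = out[left] + step * (i - left)
    return out


print(interpolate_linear([0.0, 1.0, None, 3.0]))
print(interpolate_linear([0.0, None, None, 3.0]))
```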
isin
isin(values) -> "Series" | None
Whether elements in Series are contained in values.
Return a boolean Series showing whether each element in the Series matches an element in the passed sequence of values exactly.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series(['llama', 'cow', 'llama', 'beetle', 'llama',
...                 'hippo'], name='animal')
>>> s
0     llama
1       cow
2     llama
3    beetle
4     llama
5     hippo
Name: animal, dtype: string
To invert the boolean values, use the ~ operator:
>>> ~s.isin(['cow', 'llama'])
0    False
1    False
2    False
3     True
4    False
5     True
Name: animal, dtype: boolean
Passing a single string as s.isin('llama') will raise an error. Use a list of one element instead:
>>> s.isin(['llama'])
0     True
1    False
2     True
3    False
4     True
5    False
Name: animal, dtype: boolean
Strings and integers are distinct and are therefore not comparable:
>>> bpd.Series([1]).isin(['1'])
0    False
dtype: boolean
>>> bpd.Series([1.1]).isin(['1.1'])
0    False
dtype: boolean
| Parameter | |
|---|---|
| Name | Description | 
| values | list-like. The sequence of values to test. Passing in a single string will raise a TypeError. Instead, turn a single string into a list of one element. | 
| Exceptions | |
|---|---|
| Type | Description | 
| TypeError | If input is not list-like. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Series of booleans indicating if each element is in values. | 
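The membership test, its element-wise negation, and the rejection of a bare string can be modelled in a few lines of plain Python. This `isin` is a hypothetical list-based helper, not the bigframes method.

```python
def isin(values, candidates):
    """Return a list of booleans: is each value one of the candidates?"""
    if isinstance(candidates, str):
        # A bare string is not treated as a list of candidates.
        raise TypeError("pass a list-like such as ['llama'], not a bare string")
    lookup = set(candidates)
    return [v in lookup for v in values]


animals = ["llama", "cow", "llama", "beetle", "llama", "hippo"]
matches = isin(animals, ["cow", "llama"])
inverted = [not flag for flag in matches]  # element-wise negation of the result

print(matches)
print(inverted)
print(isin([1], ["1"]))  # an integer never equals a string
```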
isna
isna() -> bigframes.series.Series
Detect missing values.
Return a boolean same-sized object indicating if the values are NA.
NA values get mapped to True values. Everything else gets mapped to
False values. Characters such as empty strings '' or
numpy.inf are not considered NA values.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> import numpy as np
>>> df = bpd.DataFrame(dict(
...         age=[5, 6, np.nan],
...         born=[bpd.NA, "1940-04-25", "1940-04-25"],
...         name=['Alfred', 'Batman', ''],
...         toy=[None, 'Batmobile', 'Joker'],
... ))
>>> df
    age        born    name        toy
0   5.0        <NA>  Alfred       <NA>
1   6.0  1940-04-25  Batman  Batmobile
2  <NA>  1940-04-25              Joker
<BLANKLINE>
[3 rows x 4 columns]
Show which entries in a DataFrame are NA:
>>> df.isna()
    age   born   name    toy
0  False   True  False   True
1  False  False  False  False
2   True  False  False  False
<BLANKLINE>
[3 rows x 4 columns]
>>> df.isnull()
    age   born   name    toy
0  False   True  False   True
1  False  False  False  False
2   True  False  False  False
<BLANKLINE>
[3 rows x 4 columns]
Show which entries in a Series are NA:
>>> ser = bpd.Series([5, None, 6, np.nan, bpd.NA])
>>> ser
0       5
1    <NA>
2       6
3    <NA>
4    <NA>
dtype: Int64
>>> ser.isna()
0    False
1     True
2    False
3     True
4     True
dtype: boolean
>>> ser.isnull()
0    False
1     True
2    False
3     True
4     True
dtype: boolean
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame or bigframes.pandas.Series | Mask of bool values for each element that indicates whether an element is an NA value. | 
isnull
isnull() -> bigframes.series.Series
Detect missing values.
Return a boolean same-sized object indicating if the values are NA.
NA values get mapped to True values. Everything else gets mapped to
False values. Characters such as empty strings '' or
numpy.inf are not considered NA values.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> import numpy as np
>>> df = bpd.DataFrame(dict(
...         age=[5, 6, np.nan],
...         born=[bpd.NA, "1940-04-25", "1940-04-25"],
...         name=['Alfred', 'Batman', ''],
...         toy=[None, 'Batmobile', 'Joker'],
... ))
>>> df
    age        born    name        toy
0   5.0        <NA>  Alfred       <NA>
1   6.0  1940-04-25  Batman  Batmobile
2  <NA>  1940-04-25              Joker
<BLANKLINE>
[3 rows x 4 columns]
Show which entries in a DataFrame are NA:
>>> df.isna()
    age   born   name    toy
0  False   True  False   True
1  False  False  False  False
2   True  False  False  False
<BLANKLINE>
[3 rows x 4 columns]
>>> df.isnull()
    age   born   name    toy
0  False   True  False   True
1  False  False  False  False
2   True  False  False  False
<BLANKLINE>
[3 rows x 4 columns]
Show which entries in a Series are NA:
>>> ser = bpd.Series([5, None, 6, np.nan, bpd.NA])
>>> ser
0       5
1    <NA>
2       6
3    <NA>
4    <NA>
dtype: Int64
>>> ser.isna()
0    False
1     True
2    False
3     True
4     True
dtype: boolean
>>> ser.isnull()
0    False
1     True
2    False
3     True
4     True
dtype: boolean
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame or bigframes.pandas.Series | Mask of bool values for each element that indicates whether an element is an NA value. | 
items
items()
Lazily iterate over (index, value) tuples.
This method returns an iterable of (index, value) tuples, which is convenient if you want to create a lazy iterator.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series(['A', 'B', 'C'])
>>> for index, value in s.items():
...     print(f"Index : {index}, Value : {value}")
Index : 0, Value : A
Index : 1, Value : B
Index : 2, Value : C
| Returns | |
|---|---|
| Type | Description | 
| iterable | Iterable of tuples containing the (index, value) pairs from a Series. | 
kurt
kurt()
Return unbiased kurtosis over requested axis.
Kurtosis obtained using Fisher’s definition of kurtosis (kurtosis of normal == 0.0). Normalized by N-1.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 2, 2, 3], index=['cat', 'dog', 'dog', 'mouse'])
>>> s
cat      1
dog      2
dog      2
mouse    3
dtype: Int64
>>> s.kurt()
np.float64(1.5)
With a DataFrame
>>> df = bpd.DataFrame({'a': [1, 2, 2, 3], 'b': [3, 4, 4, 4]},
...                    index=['cat', 'dog', 'dog', 'mouse'])
>>> df
       a  b
cat    1  3
dog    2  4
dog    2  4
mouse  3  4
<BLANKLINE>
[4 rows x 2 columns]
>>> df.kurt()
a    1.5
b    4.0
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| scalar | Unbiased kurtosis over requested axis. | 
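The example values are consistent with the standard bias-corrected sample excess kurtosis (Fisher's definition, so a normal distribution scores 0). The sketch below assumes that formula. It is a stand-alone check of the numbers in the examples, not the bigframes implementation, and the `kurt` function here is hypothetical.

```python
import math


def kurt(values):
    """Bias-corrected sample excess kurtosis (Fisher's definition)."""
    n = len(values)
    if n < 4:
        return math.nan  # the correction terms need at least four observations
    mean = sum(values) / n
    sum_sq = sum((x - mean) ** 2 for x in values)
    sum_fourth = sum((x - mean) ** 4 for x in values)
    variance = sum_sq / (n - 1)  # unbiased sample variance
    if variance == 0:
        return math.nan
    scaled = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum_fourth / variance ** 2
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scaled - correction


print(kurt([1, 2, 2, 3]))  # column 'a' in the DataFrame example
print(kurt([3, 4, 4, 4]))  # column 'b' in the DataFrame example
```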
kurtosis
kurtosis()
Return unbiased kurtosis over requested axis.
Kurtosis obtained using Fisher’s definition of kurtosis (kurtosis of normal == 0.0). Normalized by N-1.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 2, 2, 3], index=['cat', 'dog', 'dog', 'mouse'])
>>> s
cat      1
dog      2
dog      2
mouse    3
dtype: Int64
>>> s.kurt()
np.float64(1.5)
With a DataFrame
>>> df = bpd.DataFrame({'a': [1, 2, 2, 3], 'b': [3, 4, 4, 4]},
...                    index=['cat', 'dog', 'dog', 'mouse'])
>>> df
       a  b
cat    1  3
dog    2  4
dog    2  4
mouse  3  4
<BLANKLINE>
[4 rows x 2 columns]
>>> df.kurt()
a    1.5
b    4.0
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| scalar | Unbiased kurtosis over requested axis. | 
le
le(other) -> bigframes.series.Series
Get 'less than or equal to' of Series and other, element-wise (binary operator le).
Equivalent to series <= other, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.le(b)
a    True
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: boolean
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the comparison. | 
line
line(
    x: typing.Optional[typing.Hashable] = None,
    y: typing.Optional[typing.Hashable] = None,
    **kwargs
)
Plot Series or DataFrame as lines. This function is useful to plot lines using DataFrame's values as coordinates.
This function calls pandas.plot to generate a plot with a random sample
of items. For consistent results, the random sampling is reproducible.
Use the sampling_random_state parameter to modify the sampling seed.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame(
...     {
...         'one': [1, 2, 3, 4],
...         'three': [3, 6, 9, 12],
...         'reverse_ten': [40, 30, 20, 10],
...     }
... )
>>> ax = df.plot.line(x='one')
| Parameters | |
|---|---|
| Name | Description | 
| x | label or position, optional. Allows plotting of one column versus another. If not specified, the index of the DataFrame is used. | 
| y | label or position, optional. Allows plotting of one column versus another. If not specified, all numerical columns are used. | 
| color | str, array-like, or dict, optional. The color for each of the DataFrame's columns. Possible values are: - A single color string referred to by name, RGB or RGBA code, for instance 'red' or '#a98d19'. - A sequence of color strings referred to by name, RGB or RGBA code, which will be used for each column recursively. For instance, with ['green', 'yellow'] each column's line will be filled in green or yellow, alternately. If there is only a single column to be plotted, then only the first color from the color list will be used. - A dict of the form {column name : color}, so that each column will be colored accordingly. For example, if your columns are called  | 
| sampling_n | int, default 100. Number of random items for plotting. | 
| sampling_random_state | int, default 0. Seed for random number generator. | 
| Returns | |
|---|---|
| Type | Description | 
| matplotlib.axes.Axes or np.ndarray of them | An ndarray is returned with one matplotlib.axes.Axes per column when subplots=True. | 
lt
lt(other) -> bigframes.series.Series
Get 'less than' of Series and other, element-wise (binary operator lt).
Equivalent to series < other, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.lt(b)
a    False
b     <NA>
c     <NA>
d     <NA>
e     <NA>
dtype: boolean
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
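The NA rows in the example above follow from label alignment: the result covers the union of both indexes, and a label that is absent from either input, or holds NA in either input, yields NA. A minimal pure-Python sketch of that behaviour, assuming plain dicts stand in for labeled Series and `None` stands in for `<NA>` (the helper `aligned_lt` is hypothetical and not part of bigframes):

```python
def aligned_lt(left, right):
    """Compare two label->value dicts element-wise after aligning on labels."""
    result = {}
    # The result index is the sorted union of both sets of labels.
    for label in sorted(set(left) | set(right)):
        x = left.get(label)   # None when the label is absent or the value is NA
        y = right.get(label)
        result[label] = None if x is None or y is None else x < y
    return result


a = {"a": 1.0, "b": 1.0, "c": 1.0, "d": None}
b = {"a": 1.0, "b": None, "d": 1.0, "e": None}
print(aligned_lt(a, b))
# {'a': False, 'b': None, 'c': None, 'd': None, 'e': None}
```

The same alignment pattern explains the NA rows in the other binary-operator examples on this page (mod, mul, ne and so on).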
map
map(
    arg: typing.Union[typing.Mapping, bigframes.series.Series],
    na_action: typing.Optional[str] = None,
    *,
    verify_integrity: bool = False
) -> bigframes.series.Series
Map values of Series according to an input mapping or function.
Used for substituting each value in a Series with another value,
that may be derived from a remote function, dict, or a Series.
If arg is a remote function, the overhead for remote functions applies. If mapping with a dict, fully deferred computation is possible. If mapping with a Series, fully deferred computation is only possible if verify_integrity=False.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series(['cat', 'dog', bpd.NA, 'rabbit'])
>>> s
0       cat
1       dog
2      <NA>
3    rabbit
dtype: string
map accepts a dict. Values that are not found in the dict are
converted to NA:
>>> s.map({'cat': 'kitten', 'dog': 'puppy'})
0    kitten
1     puppy
2      <NA>
3      <NA>
dtype: string
It also accepts a remote function:
>>> @bpd.remote_function()
... def my_mapper(val: str) -> str:
...     vowels = ["a", "e", "i", "o", "u"]
...     if val:
...         return "".join([
...             ch.upper() if ch in vowels else ch for ch in val
...         ])
...     return "N/A"
>>> s.map(my_mapper)
0       cAt
1       dOg
2       N/A
3    rAbbIt
dtype: string
| Parameter | |
|---|---|
| Name | Description | 
| arg | function, Mapping, Series. Remote function, collections.abc.Mapping subclass, or Series that defines the mapping correspondence. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Same index as caller. | 
mask
mask(cond, other=None) -> bigframes.series.Series
Replace values where the condition is True.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([10, 11, 12, 13, 14])
>>> s
0    10
1    11
2    12
3    13
4    14
dtype: Int64
You can mask the values in the Series based on a condition. The values matching the condition are masked. The condition can be provided in the form of a Series.
>>> s.mask(s % 2 == 0)
0    <NA>
1      11
2    <NA>
3      13
4    <NA>
dtype: Int64
You can specify a custom mask value.
>>> s.mask(s % 2 == 0, -1)
0    -1
1    11
2    -1
3    13
4    -1
dtype: Int64
>>> s.mask(s % 2 == 0, 100*s)
0    1000
1      11
2    1200
3      13
4    1400
dtype: Int64
You can also use a remote function to evaluate the mask condition. This is useful in situations such as the following, where the mask condition depends on complicated business logic that cannot be expressed in the form of a Series.
>>> @bpd.remote_function(reuse=False)
... def should_mask(name: str) -> bool:
...     hash = 0
...     for char_ in name:
...         hash += ord(char_)
...     return hash % 2 == 0
>>> s = bpd.Series(["Alice", "Bob", "Caroline"])
>>> s
0       Alice
1         Bob
2    Caroline
dtype: string
>>> s.mask(should_mask)
0        <NA>
1         Bob
2    Caroline
dtype: string
>>> s.mask(should_mask, "REDACTED")
0    REDACTED
1         Bob
2    Caroline
dtype: string
Simple vectorized lambdas or Python functions (that is, ones that only perform operations supported on a Series) can be used directly.
>>> nums = bpd.Series([1, 2, 3, 4], name="nums")
>>> nums
0    1
1    2
2    3
3    4
Name: nums, dtype: Int64
>>> nums.mask(lambda x: (x+1) % 2 == 1)
0        1
1     <NA>
2        3
3     <NA>
Name: nums, dtype: Int64
>>> def is_odd(num):
...     return num % 2 == 1
>>> nums.mask(is_odd)
0     <NA>
1        2
2     <NA>
3        4
Name: nums, dtype: Int64
| Parameters | |
|---|---|
| Name | Description | 
| cond | bool Series/DataFrame, array-like, or callable. Where cond is False, keep the original value. Where True, replace with corresponding value from other. If cond is callable, it is computed on the Series/DataFrame and should return boolean Series/DataFrame or array. The callable must not change input Series/DataFrame (though pandas doesn’t check it). | 
| other | scalar, Series/DataFrame, or callable. Entries where cond is True are replaced with corresponding value from other. If other is callable, it is computed on the Series/DataFrame and should return scalar or Series/DataFrame. The callable must not change input Series/DataFrame (though pandas doesn’t check it). If not specified, entries will be filled with the corresponding NULL value (np.nan for numpy dtypes, pd.NA for extension dtypes). | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Series after the replacement. | 
max
max() -> typing.Any
Return the maximum of the values over the requested axis.
If you want the index of the maximum, use idxmax. This is the equivalent
of the numpy.ndarray method argmax.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
Calculating the max of a Series:
>>> s = bpd.Series([1, 3])
>>> s
0    1
1    3
dtype: Int64
>>> s.max()
np.int64(3)
Calculating the max of a Series containing NA values:
>>> s = bpd.Series([1, 3, bpd.NA])
>>> s
0       1
1       3
2    <NA>
dtype: Int64
>>> s.max()
np.int64(3)
| Returns | |
|---|---|
| Type | Description | 
| scalar | Scalar. | 
mean
mean() -> float
Return the mean of the values over the requested axis.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
Calculating the mean of a Series:
>>> s = bpd.Series([1, 3])
>>> s
0    1
1    3
dtype: Int64
>>> s.mean()
np.float64(2.0)
Calculating the mean of a Series containing NA values:
>>> s = bpd.Series([1, 3, bpd.NA])
>>> s
0       1
1       3
2    <NA>
dtype: Int64
>>> s.mean()
np.float64(2.0)
| Returns | |
|---|---|
| Type | Description | 
| scalar | Scalar. | 
median
median(*, exact: bool = True) -> float
Return the median of the values over the requested axis.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 2, 3])
>>> s.median()
np.float64(2.0)
With a DataFrame
>>> df = bpd.DataFrame({'a': [1, 2], 'b': [2, 3]}, index=['tiger', 'zebra'])
>>> df
       a  b
tiger  1  2
zebra  2  3
<BLANKLINE>
[2 rows x 2 columns]
>>> df.median()
a    1.5
b    2.5
dtype: Float64
| Parameter | |
|---|---|
| Name | Description | 
| exact | bool, default True. Get the exact median instead of an approximate one. | 
| Returns | |
|---|---|
| Type | Description | 
| scalar | Scalar. | 
min
min() -> typing.Any
Return the minimum of the values over the requested axis.
If you want the index of the minimum, use idxmin. This is the equivalent
of the numpy.ndarray method argmin.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
Calculating the min of a Series:
>>> s = bpd.Series([1, 3])
>>> s
0    1
1    3
dtype: Int64
>>> s.min()
np.int64(1)
Calculating the min of a Series containing NA values:
>>> s = bpd.Series([1, 3, bpd.NA])
>>> s
0       1
1       3
2    <NA>
dtype: Int64
>>> s.min()
np.int64(1)
| Returns | |
|---|---|
| Type | Description | 
| scalar | Scalar. | 
mod
mod(other) -> bigframes.series.Series
Return modulo of Series and other, element-wise (binary operator mod).
Equivalent to series % other, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.mod(b)
a     0.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
mode
mode() -> bigframes.series.Series
Return the mode(s) of the Series.
The mode is the value that appears most often. There can be multiple modes.
Always returns Series even if only one value is returned.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([2, 4, 8, 2, 4, None])
>>> s.mode()
0    2.0
1    4.0
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Modes of the Series in sorted order. | 
mul
mul(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Return multiplication of Series and other, element-wise (binary operator mul).
Equivalent to series * other, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.multiply(b)
a     1.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
multiply
multiply(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Return multiplication of Series and other, element-wise (binary operator mul).
Equivalent to series * other, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.multiply(b)
a     1.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
ne
ne(other: object) -> bigframes.series.Series
Return not equal of Series and other, element-wise (binary operator ne).
Equivalent to series != other, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.ne(b)
a    False
b     <NA>
c     <NA>
d     <NA>
e     <NA>
dtype: boolean
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
nlargest
nlargest(n: int = 5, keep: str = "first") -> bigframes.series.Series
Return the largest n elements.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> countries_population = {"Italy": 59000000, "France": 65000000,
...                          "Malta": 434000, "Maldives": 434000,
...                          "Brunei": 434000, "Iceland": 337000,
...                          "Nauru": 11300, "Tuvalu": 11300,
...                          "Anguilla": 11300, "Montserrat": 5200}
>>> s = bpd.Series(countries_population)
>>> s
Italy         59000000
France        65000000
Malta           434000
Maldives        434000
Brunei          434000
Iceland         337000
Nauru            11300
Tuvalu           11300
Anguilla         11300
Montserrat        5200
dtype: Int64
The n largest elements where n=5 by default.
>>> s.nlargest()
France      65000000
Italy       59000000
Malta         434000
Maldives      434000
Brunei        434000
dtype: Int64
The n largest elements where n=3. Default keep value is first so Malta
will be kept.
>>> s.nlargest(3)
France    65000000
Italy     59000000
Malta       434000
dtype: Int64
The n largest elements where n=3 and keeping the last duplicates. Brunei
will be kept since it is the last with value 434000 based on the index order.
>>> s.nlargest(3, keep='last')
France    65000000
Italy     59000000
Brunei      434000
dtype: Int64
The n largest elements where n=3 with all duplicates kept. Note that the
returned Series has five elements due to the three duplicates.
>>> s.nlargest(3, keep='all')
France      65000000
Italy       59000000
Malta         434000
Maldives      434000
Brunei        434000
dtype: Int64
| Parameters | |
|---|---|
| Name | Description | 
| n | int, default 5. Return this many descending sorted values. | 
| keep | {'first', 'last', 'all'}, default 'first'. When there are duplicate values that cannot all fit in a Series of  | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The nlargest values in the Series, sorted in decreasing order. | 
notna
notna() -> bigframes.series.Series
Detect existing (non-missing) values.
Return a boolean same-sized object indicating if the values are not NA.
Non-missing values get mapped to True. Characters such as empty
strings '' or numpy.inf are not considered NA values.
NA values get mapped to False values.
| Returns | |
|---|---|
| Type | Description | 
| NDFrame | Mask of bool values for each element that indicates whether an element is not an NA value. | 
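The distinction between missing values and values that merely look "empty" can be illustrated without a BigQuery session. The sketch below is plain Python, not the bigframes implementation; it assumes `None` and float NaN stand in for NA, and the helper name `notna` is reused here only for illustration:

```python
import math


def notna(values):
    """Return True for each value that is not missing (neither None nor NaN)."""
    return [
        not (v is None or (isinstance(v, float) and math.isnan(v)))
        for v in values
    ]


# Empty strings and infinity count as present; None and NaN count as missing.
print(notna([1.0, "", math.inf, None, float("nan")]))
# [True, True, True, False, False]
```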
notnull
notnull() -> bigframes.series.Series
Detect existing (non-missing) values.
Return a boolean same-sized object indicating if the values are not NA.
Non-missing values get mapped to True. Characters such as empty
strings '' or numpy.inf are not considered NA values.
NA values get mapped to False values.
| Returns | |
|---|---|
| Type | Description | 
| NDFrame | Mask of bool values for each element that indicates whether an element is not an NA value. | 
nsmallest
nsmallest(n: int = 5, keep: str = "first") -> bigframes.series.Series
Return the smallest n elements.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> countries_population = {"Italy": 59000000, "France": 65000000,
...                          "Malta": 434000, "Maldives": 434000,
...                          "Brunei": 434000, "Iceland": 337000,
...                          "Nauru": 11300, "Tuvalu": 11300,
...                          "Anguilla": 11300, "Montserrat": 5200}
>>> s = bpd.Series(countries_population)
>>> s
Italy         59000000
France        65000000
Malta           434000
Maldives        434000
Brunei          434000
Iceland         337000
Nauru            11300
Tuvalu           11300
Anguilla         11300
Montserrat        5200
dtype: Int64
The n smallest elements where n=5 by default.
>>> s.nsmallest()
Montserrat      5200
Nauru          11300
Tuvalu         11300
Anguilla       11300
Iceland       337000
dtype: Int64
The n smallest elements where n=3. Default keep value is first so
Nauru and Tuvalu will be kept.
>>> s.nsmallest(3)
Montserrat     5200
Nauru         11300
Tuvalu        11300
dtype: Int64
The n smallest elements where n=3 with all duplicates kept. Note that
the returned Series has four elements due to the three duplicates.
>>> s.nsmallest(3, keep='all')
Montserrat     5200
Nauru         11300
Tuvalu        11300
Anguilla      11300
dtype: Int64
| Parameters | |
|---|---|
| Name | Description | 
| n | int, default 5. Return this many ascending sorted values. | 
| keep | {'first', 'last', 'all'}, default 'first'. When there are duplicate values that cannot all fit in a Series of  | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The nsmallest values in the Series, sorted in increasing order. | 
nunique
nunique() -> int
Return number of unique elements in the object.
Excludes NA values by default.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 3, 5, 7, 7])
>>> s
0    1
1    3
2    5
3    7
4    7
dtype: Int64
>>> s.nunique()
np.int64(4)
| Returns | |
|---|---|
| Type | Description | 
| int | Number of unique elements in the object. | 
pad
pad(*, limit: typing.Optional[int] = None) -> bigframes.series.Series
Fill NA/NaN values by propagating the last valid observation to next valid.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame([[np.nan, 2, np.nan, 0],
...                     [3, 4, np.nan, 1],
...                     [np.nan, np.nan, np.nan, np.nan],
...                     [np.nan, 3, np.nan, 4]],
...                    columns=list("ABCD")).astype("Float64")
>>> df
      A     B     C     D
0  <NA>   2.0  <NA>   0.0
1   3.0   4.0  <NA>   1.0
2  <NA>  <NA>  <NA>  <NA>
3  <NA>   3.0  <NA>   4.0
<BLANKLINE>
[4 rows x 4 columns]
Fill NA/NaN values in DataFrames:
>>> df.ffill()
      A    B     C    D
0  <NA>  2.0  <NA>  0.0
1   3.0  4.0  <NA>  1.0
2   3.0  4.0  <NA>  1.0
3   3.0  3.0  <NA>  4.0
<BLANKLINE>
[4 rows x 4 columns]
Fill NA/NaN values in Series:
>>> series = bpd.Series([1, np.nan, 2, 3])
>>> series.ffill()
0    1.0
1    1.0
2    2.0
3    3.0
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame or bigframes.pandas.Series or None | Object with missing values filled. | 
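The examples above do not exercise the `limit` argument. The following pure-Python sketch shows forward filling with a cap on consecutive fills, assuming `limit` carries the same meaning as in pandas (the maximum number of consecutive NA values to fill). The helper `ffill` on plain lists is hypothetical, and `None` stands in for `<NA>`:

```python
def ffill(values, limit=None):
    """Forward-fill None entries, filling at most `limit` consecutive gaps."""
    filled = []
    last_valid = None
    run = 0  # consecutive gaps filled since the last valid value
    for v in values:
        if v is not None:
            last_valid, run = v, 0
            filled.append(v)
        elif last_valid is not None and (limit is None or run < limit):
            run += 1
            filled.append(last_valid)
        else:
            # Either nothing to propagate yet, or the limit is exhausted.
            filled.append(None)
    return filled


print(ffill([1.0, None, None, 3.0]))           # [1.0, 1.0, 1.0, 3.0]
print(ffill([1.0, None, None, 3.0], limit=1))  # [1.0, 1.0, None, 3.0]
print(ffill([None, 2.0, None]))                # [None, 2.0, 2.0]
```

Leading NA values stay NA because there is no earlier valid observation to propagate.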
pct_change
pct_change(periods: int = 1) -> bigframes.series.Series
Fractional change between the current and a prior element.
Computes the fractional change from the immediately previous row by default. This is useful in comparing the fraction of change in a time series of elements.
| Parameter | |
|---|---|
| Name | Description | 
| periods | int, default 1. Periods to shift for forming percent change. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame or bigframes.pandas.Series | The same type as the calling object. | 
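As a worked illustration of the arithmetic, the fractional change at row i is (value[i] - value[i - periods]) / value[i - periods]. The sketch below is plain Python on a list, not the bigframes implementation; it assumes a positive `periods`, no NA inputs, and uses `None` for rows that have no earlier row to compare against:

```python
def pct_change(values, periods=1):
    """Fractional change between each element and the one `periods` rows earlier."""
    out = []
    for i, cur in enumerate(values):
        if i < periods:
            out.append(None)  # no earlier row to compare against
        else:
            prev = values[i - periods]
            out.append((cur - prev) / prev)
    return out


print(pct_change([10.0, 15.0, 30.0]))             # [None, 0.5, 1.0]
print(pct_change([10.0, 15.0, 30.0], periods=2))  # [None, None, 2.0]
```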
peek
peek(n: int = 5, *, force: bool = True) -> pandas.core.series.Series
Preview n arbitrary elements from the series without guarantees about row selection or ordering.
Series.peek(force=False) will always be very fast, but will not succeed if the data requires
a full scan. Using force=True will always succeed, but may perform queries.
Query results are cached so that future steps benefit from these queries.
| Parameters | |
|---|---|
| Name | Description | 
| n | int, default 5. The number of rows to select from the series. Which N rows are returned is non-deterministic. | 
| force | bool, default True. If the data cannot be peeked efficiently, the series will instead be fully materialized as part of the operation if  | 
| Exceptions | |
|---|---|
| Type | Description | 
| ValueError | If force=False and data cannot be efficiently peeked. | 
| Returns | |
|---|---|
| Type | Description | 
| pandas.Series | A pandas Series with n rows. | 
pow
pow(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Return exponential power of Series and other, element-wise (binary operator pow).
Equivalent to series ** other, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.pow(b)
a     1.0
b     1.0
c     1.0
d    <NA>
e    <NA>
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
prod
prod() -> float
Return the product of the values over the requested axis.
| Returns | |
|---|---|
| Type | Description | 
| scalar | Scalar. | 
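No example accompanies prod in the source. As a rough illustration of the reduction, here is a plain-Python sketch that multiplies the non-missing values of a list. It assumes NA values are skipped, as they are by default in pandas; that assumption is not confirmed by the text above, and the helper `prod_skipna` is hypothetical:

```python
import math


def prod_skipna(values):
    """Multiply all non-missing values together (None stands in for <NA>)."""
    return math.prod(v for v in values if v is not None)


print(prod_skipna([2.0, 3.0, None, 4.0]))  # 24.0
```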
product
product() -> float
Return the product of the values over the requested axis.
| Returns | |
|---|---|
| Type | Description | 
| scalar | Scalar. | 
quantile
quantile(
    q: typing.Union[float, typing.Sequence[float]] = 0.5
) -> typing.Union[bigframes.series.Series, float]
Return value at the given quantile.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 2, 3, 4])
>>> s.quantile(.5)
np.float64(2.5)
>>> s.quantile([.25, .5, .75])
0.25    1.75
0.5      2.5
0.75    3.25
dtype: Float64
| Parameter | |
|---|---|
| Name | Description | 
| q | Union[float, Sequence[float]], default 0.5 (50% quantile). The quantile(s) to compute, which can lie in range: 0 <= q <= 1. | 
| Returns | |
|---|---|
| Type | Description | 
| Union[float, bigframes.pandas.Series] | If q is an array, a Series will be returned where the index is q and the values are the quantiles, otherwise a float will be returned. | 
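The values in the example (1.75, 2.5, 3.25 for the data 1, 2, 3, 4) are consistent with linear interpolation between the two nearest sorted values. The sketch below reproduces them in plain Python under that assumption; it is an illustration of the arithmetic, not the bigframes implementation, and the standalone `quantile` helper is hypothetical:

```python
import math


def quantile(values, q):
    """Quantile of `values` at fraction q (0 <= q <= 1), linearly interpolated."""
    if not 0 <= q <= 1:
        raise ValueError("q must lie in the range 0 <= q <= 1")
    data = sorted(values)
    pos = q * (len(data) - 1)  # fractional position in the sorted data
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return data[lo] + (data[hi] - data[lo]) * frac


s = [1, 2, 3, 4]
print([quantile(s, q) for q in (0.25, 0.5, 0.75)])  # [1.75, 2.5, 3.25]
```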
radd
radd(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Return addition of Series and other, element-wise (binary operator radd).
Equivalent to other + series, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.radd(b)
a     2.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
rank
rank(
    axis=0,
    method: str = "average",
    numeric_only=False,
    na_option: str = "keep",
    ascending: bool = True,
) -> bigframes.series.Series
Compute numerical data ranks (1 through n) along axis.
By default, equal values are assigned a rank that is the average of the ranks of those values.
| Parameters | |
|---|---|
| Name | Description | 
| method | {'average', 'min', 'max', 'first', 'dense'}, default 'average'. How to rank the group of records that have the same value (i.e. ties):  | 
| numeric_only | bool, default False. For DataFrame objects, rank only numeric columns if set to True. | 
| na_option | {'keep', 'top', 'bottom'}, default 'keep'. How to rank NaN values:  | 
| ascending | bool, default True. Whether or not the elements should be ranked in ascending order. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame or bigframes.pandas.Series | Return a Series or DataFrame with data ranks as values. | 
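To make the default tie handling concrete, the following plain-Python sketch assigns ascending ranks from 1 through n and gives tied values the average of the ranks they span. It illustrates only the 'average' method on a list with no NA values; the helper `rank_average` is hypothetical and not part of bigframes:

```python
def rank_average(values):
    """Rank values 1..n ascending; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks


print(rank_average([10, 20, 10, 30]))  # [1.5, 3.0, 1.5, 4.0]
```

The two 10s occupy positions 1 and 2, so each receives rank 1.5.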
rdiv
rdiv(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Return floating division of Series and other, element-wise (binary operator rtruediv).
Equivalent to other / series, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.rdiv(b)
a     1.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
rdivmod
rdivmod(other) -> typing.Tuple[bigframes.series.Series, bigframes.series.Series]
Return integer division and modulo of Series and other, element-wise (binary operator rdivmod).
Equivalent to divmod(other, series).
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.rdivmod(b)
(a     1.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64,
a     0.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64)
| Returns | |
|---|---|
| Type | Description | 
| Tuple[bigframes.pandas.Series, bigframes.pandas.Series] | The result of the operation. The result is always consistent with (rfloordiv, rmod) (though pandas may not). | 
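The reflected form places `other` on the left-hand side, and the pair it returns matches what reflected floor division and reflected modulo would give separately. A small plain-Python sketch of that relationship, assuming a scalar `other`, a list in place of the Series, and `None` for `<NA>` (the standalone `rdivmod` helper is hypothetical):

```python
def rdivmod(series_values, other):
    """Element-wise divmod(other, x); None stands in for <NA>."""
    quotients, remainders = [], []
    for x in series_values:
        if x is None:
            quotients.append(None)
            remainders.append(None)
        else:
            q, r = divmod(other, x)  # `other` is the dividend, x the divisor
            quotients.append(q)
            remainders.append(r)
    return quotients, remainders


q, r = rdivmod([2, 3, None], 7)
print(q, r)  # [3, 2, None] [1, 1, None]
```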
reindex
reindex(index=None, *, validate: typing.Optional[bool] = None)
Conform Series to new index with optional filling logic.
Places NA/NaN in locations having no value in the previous index. A new object
is produced unless the new index is equivalent to the current one and
copy=False.
| Parameter | |
|---|---|
| Name | Description | 
| index | array-like, optional. New labels for the index. Preferably an Index object to avoid duplicating data. | 
| Returns | |
|---|---|
| Type | Description | 
| Series | Series with changed index. | 
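The effect of conforming to a new index can be sketched with a plain dict: labels present in the original keep their values, and labels with no previous value are filled with NA. This is an illustration only, with `None` standing in for NA and a hypothetical `reindex` helper rather than the bigframes method:

```python
def reindex(series, new_index):
    """Conform a label->value dict to new_index; unknown labels map to None."""
    return {label: series.get(label) for label in new_index}


s = {"a": 1, "b": 2, "c": 3}
print(reindex(s, ["c", "a", "z"]))  # {'c': 3, 'a': 1, 'z': None}
```

Labels omitted from the new index ('b' here) are dropped from the result.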
reindex_like
reindex_like(
    other: bigframes.series.Series, *, validate: typing.Optional[bool] = None
)
Return an object with indices matching those of the other object.
Conform the object to the same index on all axes. Optional filling logic, placing Null in locations having no value in the previous index.
| Parameter | |
|---|---|
| Name | Description | 
| other | Object of the same data type. Its row and column indices are used to define the new indices of this object. | 
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Same type as caller, but with changed indices on each axis. | 
rename
rename(
    index: typing.Union[typing.Hashable, typing.Mapping[typing.Any, typing.Any]] = None,
    **kwargs
) -> bigframes.series.Series
Alter Series index labels or name.
Function / dict values must be unique (1-to-1). Labels not contained in a dict / Series will be left as-is. Extra labels listed don't throw an error.
Alternatively, change Series.name with a scalar value.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 2, 3])
>>> s
0    1
1    2
2    3
dtype: Int64
You can change the Series name by specifying a string scalar:
>>> s.rename("my_name")
0    1
1    2
2    3
Name: my_name, dtype: Int64
You can change the labels by specifying a mapping:
>>> s.rename({1: 3, 2: 5})
0    1
3    2
5    3
dtype: Int64
| Parameter | |
|---|---|
| Name | Description | 
| index | scalar, hashable sequence, dict-like or function, optional. Functions or dict-like are transformations to apply to the index. Scalar or hashable sequence-like will alter the  | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Series with index labels. | 
rename_axis
rename_axis(
    mapper: typing.Union[typing.Hashable, typing.Sequence[typing.Hashable]], **kwargs
) -> bigframes.series.Series
Set the name of the axis for the index or columns.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
Series
>>> s = bpd.Series(["dog", "cat", "monkey"])
>>> s
0       dog
1       cat
2    monkey
dtype: string
>>> s.rename_axis("animal")
animal
0       dog
1       cat
2    monkey
dtype: string
DataFrame
>>> df = bpd.DataFrame({"num_legs": [4, 4, 2],
...                     "num_arms": [0, 0, 2]},
...                    ["dog", "cat", "monkey"])
>>> df
        num_legs  num_arms
dog            4         0
cat            4         0
monkey         2         2
| Parameter | |
|---|---|
| Name | Description | 
| mapper | scalar, list-like, optional. Value to set the axis name attribute. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series or bigframes.pandas.DataFrame | The same type as the caller. | 
reorder_levels
reorder_levels(
    order: typing.Union[str, int, typing.Sequence[typing.Union[str, int]]],
    axis: int | str = 0,
)
Rearrange index levels using input order.
May not drop or duplicate levels.
| Parameters | |
|---|---|
| Name | Description | 
| order | list of int representing new level order. Reference level by number or key. | 
| axis | {0 or 'index', 1 or 'columns'}, default 0. For  | 
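Since no example accompanies reorder_levels, here is a plain-Python sketch of the idea: each multi-level label is a tuple, and `order` says which existing level goes in each new position. The check mirrors the rule that levels may not be dropped or duplicated. It assumes levels are referenced by integer position only, and the standalone helper is hypothetical:

```python
def reorder_levels(index, order):
    """Rearrange the positions inside each multi-level label according to `order`."""
    n_levels = len(index[0])
    # Every level must appear exactly once: no dropping, no duplicating.
    if sorted(order) != list(range(n_levels)):
        raise ValueError("order must use every level exactly once")
    return [tuple(label[i] for i in order) for label in index]


index = [("bar", "one"), ("bar", "two"), ("baz", "one")]
print(reorder_levels(index, [1, 0]))
# [('one', 'bar'), ('two', 'bar'), ('one', 'baz')]
```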
replace
replace(to_replace: typing.Any, value: typing.Any = None, *, regex: bool = False)
Replace values given in to_replace with value.
Values of the Series/DataFrame are replaced with other values dynamically.
This differs from updating with .loc or .iloc, which require
you to specify a location to update with some value.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 2, 3, 4, 5])
>>> s
0    1
1    2
2    3
3    4
4    5
dtype: Int64
>>> s.replace(1, 5)
0    5
1    2
2    3
3    4
4    5
dtype: Int64
You can replace a list of values:
>>> s.replace([1, 3, 5], -1)
0    -1
1     2
2    -1
3     4
4    -1
dtype: Int64
You can use a replacement mapping:
>>> s.replace({1: 5, 3: 10})
0     5
1     2
2    10
3     4
4     5
dtype: Int64
With a string Series you can use a simple string replacement or a regex replacement:
>>> s = bpd.Series(["Hello", "Another Hello"])
>>> s.replace("Hello", "Hi")
0               Hi
1    Another Hello
dtype: string
>>> s.replace("Hello", "Hi", regex=True)
0            Hi
1    Another Hi
dtype: string
>>> s.replace("^Hello", "Hi", regex=True)
0               Hi
1    Another Hello
dtype: string
>>> s.replace("Hello$", "Hi", regex=True)
0            Hi
1    Another Hi
dtype: string
>>> s.replace("[Hh]e", "__", regex=True)
0            __llo
1    Anot__r __llo
dtype: string
| Parameters | |
|---|---|
| Name | Description | 
| to_replace | str, regex, list, int, float or None. How to find the values that will be replaced. * numeric, str or regex: - numeric: numeric values equal to  | 
| value | scalar, default None. Value to replace any values matching  | 
| regex | bool, default False. Whether to interpret  | 
| Exceptions | |
|---|---|
| Type | Description | 
| TypeError | * If to_replace is not a scalar, array-like, dict, or None * If to_replace is a dict and value is not a list, dict, ndarray, or Series * If to_replace is None and regex is not compilable into a regular expression or is a list, dict, ndarray, or Series. * When replacing multiple bool or datetime64 objects and the arguments to to_replace do not match the type of the value being replaced | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series or bigframes.pandas.DataFrame | Object after replacement. | 
reset_index
reset_index(
    *, name: typing.Optional[str] = None, drop: bool = False
) -> bigframes.dataframe.DataFrame | bigframes.series.Series
Generate a new DataFrame or Series with the index reset.
This is useful when the index needs to be treated as a column, or when the index is meaningless and needs to be reset to the default before another operation.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> import pandas as pd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 2, 3, 4], name='foo',
...                index=['a', 'b', 'c', 'd'])
>>> s.index.name = "idx"
>>> s
idx
a    1
b    2
c    3
d    4
Name: foo, dtype: Int64
Generate a DataFrame with default index.
>>> s.reset_index()
    idx  foo
0     a    1
1     b    2
2     c    3
3     d    4
<BLANKLINE>
[4 rows x 2 columns]
To specify the name of the new column use name param.
>>> s.reset_index(name="bar")
    idx   bar
0     a    1
1     b    2
2     c    3
3     d    4
<BLANKLINE>
[4 rows x 2 columns]
To generate a new Series with the default index set param drop=True.
>>> s.reset_index(drop=True)
0    1
1    2
2    3
3    4
Name: foo, dtype: Int64
>>> arrays = [np.array(['bar', 'bar', 'baz', 'baz']),
...           np.array(['one', 'two', 'one', 'two'])]
>>> s2 = bpd.Series(
...     range(4), name='foo',
...     index=pd.MultiIndex.from_arrays(arrays,
...                                     names=['a', 'b']))
If level is not set, all levels are removed from the Index.
>>> s2.reset_index()
     a    b  foo
0  bar  one    0
1  bar  two    1
2  baz  one    2
3  baz  two    3
<BLANKLINE>
[4 rows x 3 columns]
| Parameters | |
|---|---|
| Name | Description | 
| drop | bool, default False. Just reset the index, without inserting it as a column in the new DataFrame. | 
| name | object, optional. The name to use for the column containing the original Series values. Uses  | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series or bigframes.pandas.DataFrame or None | When drop is False (the default), a DataFrame is returned. The newly created columns will come first in the DataFrame, followed by the original Series values. When drop is True, a Series is returned. In either case, if inplace=True, no value is returned. | 
rfloordiv
rfloordiv(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Return integer division of Series and other, element-wise (binary operator rfloordiv).
Equivalent to other // series, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.floordiv(b)
a     1.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
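The reflected operators swap the operand order, so `s.rfloordiv(other)` computes `other // s` rather than `s // other`. The sketch below illustrates that idea in plain Python. It is not the bigframes implementation: it models a Series as a dict from label to value, uses `None` for missing data, and the helper name `reflected` is hypothetical. It applies a simple "missing on either side stays missing" rule and does not model special cases such as `pow`.

```python
import operator


def reflected(op, series, other):
    """Apply op(other, series) label by label, propagating missing values.

    `series` and `other` are plain dicts standing in for labeled Series.
    """
    labels = sorted(set(series) | set(other))  # union of both indexes
    result = {}
    for label in labels:
        left = other.get(label)    # reflected: `other` is the left operand
        right = series.get(label)
        if left is None or right is None:
            result[label] = None   # missing on either side stays missing
        else:
            result[label] = op(left, right)
    return result


a = {"a": 6.0, "b": 4.0, "c": 2.0}
b = {"a": 13.0, "b": None, "d": 5.0}

# In the spirit of a.rfloordiv(b), that is b // a
floordiv_result = reflected(operator.floordiv, a, b)
# In the spirit of a.rsub(b), that is b - a
sub_result = reflected(operator.sub, a, b)
print(floordiv_result)
print(sub_result)
```

Only label `a` is present and non-missing on both sides, so it is the only label with a value in either result.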
rmod
rmod(other) -> bigframes.series.Series
Return modulo of Series and other, element-wise (binary operator rmod).
Equivalent to other % series, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.mod(b)
a     0.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
rmul
rmul(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Return multiplication of Series and other, element-wise (binary operator rmul).
Equivalent to other * series, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.multiply(b)
a     1.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
rolling
rolling(window: int, min_periods=None) -> bigframes.core.window.Window
Provide rolling window calculations.
| Parameters | |
|---|---|
| Name | Description | 
| window | int, timedelta, str, offset, or BaseIndexer subclass. Size of the moving window. If an integer, the fixed number of observations used for each window. If a timedelta, str, or offset, the time period of each window. Each window will be variable-sized based on the observations included in the time period. This is only valid for datetime-like indexes. To learn more about the offsets & frequency strings, please see  | 
| min_periods | int, default None. Minimum number of observations in window required to have a value; otherwise, result is  | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.core.window.Window | Window subclass if a win_type is passed. Rolling subclass if win_type is not passed. | 
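To make the interaction between `window` and `min_periods` concrete, here is a minimal pure-Python sketch of a trailing fixed-size window sum. It is an illustration only, not how bigframes evaluates windows. The function name `rolling_sum` is hypothetical, and treating `min_periods=None` as "require a full window" is an assumption borrowed from the pandas default for integer windows.

```python
def rolling_sum(values, window, min_periods=None):
    """Sum over a trailing window of `window` observations.

    Positions with fewer than `min_periods` observed values yield None.
    """
    if min_periods is None:
        # Assumption: mirror the pandas default for integer windows.
        min_periods = window
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        observed = [v for v in chunk if v is not None]
        out.append(sum(observed) if len(observed) >= min_periods else None)
    return out


data = [1, 2, 3, 4, 5]
full_windows = rolling_sum(data, window=3)
partial_windows = rolling_sum(data, window=3, min_periods=1)
print(full_windows)     # the first two positions lack a full window
print(partial_windows)  # partial windows are allowed
```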
round
round(decimals=0) -> bigframes.series.Series
Round each value in a Series to the given number of decimals.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([0.1, 1.3, 2.7])
>>> s.round()
0    0.0
1    1.0
2    3.0
dtype: Float64
>>> s = bpd.Series([0.123, 1.345, 2.789])
>>> s.round(decimals=2)
0    0.12
1    1.34
2    2.79
dtype: Float64
| Parameter | |
|---|---|
| Name | Description | 
| decimals | int, default 0. Number of decimal places to round to. If decimals is negative, it specifies the number of positions to the left of the decimal point. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Rounded values of the Series. | 
rpow
rpow(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Return exponential power of Series and other, element-wise (binary operator rpow).
Equivalent to other ** series, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.pow(b)
a     1.0
b     1.0
c     1.0
d    <NA>
e    <NA>
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
rsub
rsub(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Return subtraction of Series and other, element-wise (binary operator rsub).
Equivalent to other - series, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.subtract(b)
a     0.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
rtruediv
rtruediv(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Return floating division of Series and other, element-wise (binary operator rtruediv).
Equivalent to other / series, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.divide(b)
a     1.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
sample
sample(
    n: typing.Optional[int] = None,
    frac: typing.Optional[float] = None,
    *,
    random_state: typing.Optional[int] = None,
    sort: typing.Optional[typing.Union[bool, typing.Literal["random"]]] = "random"
) -> bigframes.series.Series
Return a random sample of items from an axis of object.
You can use random_state for reproducibility.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({'num_legs': [2, 4, 8, 0],
...                     'num_wings': [2, 0, 0, 0],
...                     'num_specimen_seen': [10, 2, 1, 8]},
...                    index=['falcon', 'dog', 'spider', 'fish'])
>>> df
        num_legs  num_wings  num_specimen_seen
falcon         2          2                 10
dog            4          0                  2
spider         8          0                  1
fish           0          0                  8
<BLANKLINE>
[4 rows x 3 columns]
Fetch one random row from the DataFrame (Note that we use random_state
to ensure reproducibility of the examples):
>>> df.sample(random_state=1)
     num_legs  num_wings  num_specimen_seen
dog         4          0                  2
<BLANKLINE>
[1 rows x 3 columns]
A random 50% sample of the DataFrame:
>>> df.sample(frac=0.5, random_state=1)
      num_legs  num_wings  num_specimen_seen
dog          4          0                  2
fish         0          0                  8
<BLANKLINE>
[2 rows x 3 columns]
Extract 3 random elements from the Series df['num_legs']:
>>> s = df['num_legs']
>>> s.sample(n=3, random_state=1)
dog       4
fish      0
spider    8
Name: num_legs, dtype: Int64
| Parameters | |
|---|---|
| Name | Description | 
| n | Optional[int], default None. Number of items from axis to return. Cannot be used with  | 
| frac | Optional[float], default None. Fraction of axis items to return. Cannot be used with  | 
| random_state | Optional[int], default None. Seed for random number generator. | 
| sort | Optional[bool|Literal["random"]], default "random" | 
| Exceptions | |
|---|---|
| Type | Description | 
| ValueError | If both n and frac are specified. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame or bigframes.pandas.Series | A new object of same type as caller containing n items randomly sampled from the caller object. | 
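The two rules above (a seed makes the draw repeatable, and `n` and `frac` are mutually exclusive) can be sketched with the standard library. This is a local illustration, not the bigframes sampling logic. The helper `sample_items` is hypothetical, and rounding `frac * len(items)` to get a count is an assumption.

```python
import random


def sample_items(items, n=None, frac=None, random_state=None):
    """Draw items without replacement; at most one of n or frac may be given."""
    if n is not None and frac is not None:
        raise ValueError("Only one of n or frac may be specified.")
    if n is None:
        # Default to a single item, or derive the count from the fraction.
        n = 1 if frac is None else round(frac * len(items))
    rng = random.Random(random_state)  # dedicated generator, seeded for reproducibility
    return rng.sample(items, n)


rows = ["falcon", "dog", "spider", "fish"]
first = sample_items(rows, frac=0.5, random_state=1)
second = sample_items(rows, frac=0.5, random_state=1)
print(first == second)  # the same seed gives the same draw
```

The specific items drawn here will not match the bigframes output for the same seed, because the underlying generators differ.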
shift
shift(periods: int = 1) -> bigframes.series.Series
Shift index by desired number of periods.
Shifts the index without realigning the data.
| Returns | |
|---|---|
| Type | Description | 
| NDFrame | Copy of input object, shifted. | 
skew
skew()
Return unbiased skew over requested axis.
Normalized by N-1.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 2, 3])
>>> s.skew()
np.float64(0.0)
With a DataFrame
>>> df = bpd.DataFrame({'a': [1, 2, 3], 'b': [2, 3, 4], 'c': [1, 3, 5]},
...                    index=['tiger', 'zebra', 'cow'])
>>> df
        a   b   c
tiger   1   2   1
zebra   2   3   3
cow     3   4   5
<BLANKLINE>
[3 rows x 3 columns]
>>> df.skew()
a   0.0
b   0.0
c   0.0
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| scalar | Scalar. | 
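For readers who want to see what an unbiased skew estimate looks like numerically, the sketch below computes the adjusted Fisher-Pearson coefficient in plain Python. It assumes bigframes follows the same estimator pandas uses, which this page does not confirm. The function name `sample_skew` is hypothetical.

```python
import math


def sample_skew(values):
    """Adjusted Fisher-Pearson skewness (bias-corrected for sample size)."""
    n = len(values)
    if n < 3:
        return math.nan  # the correction factor is undefined below 3 points
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)  # sum of squared deviations
    m3 = sum((v - mean) ** 3 for v in values)  # sum of cubed deviations
    if m2 == 0:
        return 0.0  # constant data has no skew
    return (n * math.sqrt(n - 1) / (n - 2)) * (m3 / m2 ** 1.5)


symmetric = sample_skew([1, 2, 3])
right_tailed = sample_skew([1, 2, 10])
print(symmetric)
print(right_tailed > 0)
```

Symmetric data such as `[1, 2, 3]` gives zero, consistent with the examples above, while a long right tail gives a positive value.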
sort_index
sort_index(
    *, axis=0, ascending=True, na_position="last"
) -> bigframes.series.Series
Sort Series by index labels.
Returns a new Series sorted by label if inplace argument is
False, otherwise updates the original series and returns None.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series(['a', 'b', 'c', 'd'], index=[3, 2, 1, 4])
>>> s.sort_index()
1    c
2    b
3    a
4    d
dtype: string
Sort in descending order:
>>> s.sort_index(ascending=False)
4    d
3    a
2    b
1    c
dtype: string
By default, NaNs are put at the end; use na_position to place them at the beginning:
>>> s = bpd.Series(['a', 'b', 'c', 'd'], index=[3, 2, 1, np.nan])
>>> s.sort_index(na_position='first')
<NA>    d
1.0     c
2.0     b
3.0     a
dtype: string
| Parameters | |
|---|---|
| Name | Description | 
| axis | {0 or 'index'}. Unused. Parameter needed for compatibility with DataFrame. | 
| ascending | bool or list-like of bools, default True. Sort ascending vs. descending. When the index is a MultiIndex the sort direction can be controlled for each level individually. | 
| na_position | {'first', 'last'}, default 'last'. If 'first' puts NaNs at the beginning, 'last' puts NaNs at the end. Not implemented for MultiIndex. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series or None | The original Series sorted by the labels or None if inplace=True. | 
sort_values
sort_values(
    *, axis=0, ascending=True, kind: str = "quicksort", na_position="last"
) -> bigframes.series.Series
Sort by the values.
Sort a Series in ascending or descending order by some criterion.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([np.nan, 1, 3, 10, 5])
>>> s
0    <NA>
1     1.0
2     3.0
3    10.0
4     5.0
dtype: Float64
Sort values in ascending order (default behavior):
>>> s.sort_values(ascending=True)
1     1.0
2     3.0
4     5.0
3    10.0
0    <NA>
dtype: Float64
Sort values in descending order:
>>> s.sort_values(ascending=False)
3    10.0
4     5.0
2     3.0
1     1.0
0    <NA>
dtype: Float64
Sort values putting NAs first:
>>> s.sort_values(na_position='first')
0    <NA>
1     1.0
2     3.0
4     5.0
3    10.0
dtype: Float64
Sort a series of strings:
>>> s = bpd.Series(['z', 'b', 'd', 'a', 'c'])
>>> s
0    z
1    b
2    d
3    a
4    c
dtype: string
>>> s.sort_values()
3    a
1    b
4    c
2    d
0    z
dtype: string
| Parameters | |
|---|---|
| Name | Description | 
| axis | 0 or 'index'. Unused. Parameter needed for compatibility with DataFrame. | 
| ascending | bool or list of bools, default True. If True, sort values in ascending order, otherwise descending. | 
| kind | str, default 'quicksort'. Choice of sorting algorithm. Accepts 'quicksort', 'mergesort', 'heapsort', 'stable'. Ignored except when determining whether to sort stably. 'mergesort' or 'stable' will result in a stable reorder. | 
| na_position | {'first' or 'last'}, default 'last'. Argument 'first' puts NaNs at the beginning, 'last' puts NaNs at the end. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series or None | Series ordered by values or None if inplace=True. | 
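The examples above show that missing values are placed according to `na_position` regardless of the sort direction. A small pure-Python sketch of that rule follows. It is an illustration rather than the bigframes implementation, it uses `None` for missing values, and the helper name `sort_with_na` is hypothetical.

```python
def sort_with_na(values, ascending=True, na_position="last"):
    """Sort the present values, then attach missing ones at the chosen end."""
    if na_position not in ("first", "last"):
        raise ValueError("na_position must be 'first' or 'last'")
    present = sorted((v for v in values if v is not None), reverse=not ascending)
    missing = [v for v in values if v is None]
    return missing + present if na_position == "first" else present + missing


data = [None, 1.0, 3.0, 10.0, 5.0]
print(sort_with_na(data))                       # [1.0, 3.0, 5.0, 10.0, None]
print(sort_with_na(data, ascending=False))      # [10.0, 5.0, 3.0, 1.0, None]
print(sort_with_na(data, na_position="first"))  # [None, 1.0, 3.0, 5.0, 10.0]
```

Note that `ascending=False` reverses only the present values, so the missing entry stays last, matching the descending example above.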
std
std() -> float
Return sample standard deviation over requested axis.
Normalized by N-1 by default.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame({'person_id': [0, 1, 2, 3],
...                     'age': [21, 25, 62, 43],
...                     'height': [1.61, 1.87, 1.49, 2.01]}
...                   ).set_index('person_id')
>>> df
           age  height
person_id
0           21    1.61
1           25    1.87
2           62    1.49
3           43    2.01
<BLANKLINE>
[4 rows x 2 columns]
>>> df.std()
age       18.786076
height     0.237417
dtype: Float64
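To check the "normalized by N-1" statement against the numbers above, the same columns can be recomputed with the standard library, where `statistics.stdev` divides by N-1 and `statistics.pstdev` divides by N. This is an independent cross-check in plain Python, not a bigframes call.

```python
import statistics

ages = [21, 25, 62, 43]
heights = [1.61, 1.87, 1.49, 2.01]

sample_std_age = statistics.stdev(ages)         # divides by N - 1
population_std_age = statistics.pstdev(ages)    # divides by N
sample_std_height = statistics.stdev(heights)

print(round(sample_std_age, 6))
print(round(sample_std_height, 6))
print(population_std_age < sample_std_age)
```

The sample values agree with the `df.std()` output shown above to the displayed precision, while the population value is smaller.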
sub
sub(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Return subtraction of Series and other, element-wise (binary operator sub).
Equivalent to series - other, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.subtract(b)
a     0.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
subtract
subtract(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Return subtraction of Series and other, element-wise (binary operator sub).
Equivalent to series - other, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.subtract(b)
a     0.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
sum
sum() -> float
Return the sum of the values over the requested axis.
This is equivalent to the method numpy.sum.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
Calculating the sum of a Series:
>>> s = bpd.Series([1, 3])
>>> s
0    1
1    3
dtype: Int64
>>> s.sum()
np.int64(4)
Calculating the sum of a Series containing NA values:
>>> s = bpd.Series([1, 3, bpd.NA])
>>> s
0       1
1       3
2    <NA>
dtype: Int64
>>> s.sum()
np.int64(4)
| Returns | |
|---|---|
| Type | Description | 
| scalar | Scalar. | 
swaplevel
swaplevel(i: int = -2, j: int = -1)
Swap levels i and j in a MultiIndex.
Default is to swap the two innermost levels of the index.
| Parameters | |
|---|---|
| Name | Description | 
| i | int or str. Levels of the indices to be swapped. Can pass level name as string. | 
| j | int or str. Levels of the indices to be swapped. Can pass level name as string. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Series with levels swapped in MultiIndex | 
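The defaults `i=-2` and `j=-1` address levels from the end, which is why they select the two innermost levels. The sketch below shows the effect on MultiIndex-style tuple keys in plain Python. It is illustrative only, the helper `swap_levels` is hypothetical, and it supports integer positions but not level names.

```python
def swap_levels(keys, i=-2, j=-1):
    """Swap positions i and j in every tuple key."""
    swapped = []
    for key in keys:
        levels = list(key)
        levels[i], levels[j] = levels[j], levels[i]
        swapped.append(tuple(levels))
    return swapped


index = [("bar", "one", 2020), ("bar", "two", 2021), ("baz", "one", 2022)]
innermost = swap_levels(index)        # default: swap the two innermost levels
outermost = swap_levels(index, 0, 1)  # swap the two outermost levels
print(innermost)
print(outermost)
```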
tail
tail(n: int = 5) -> bigframes.series.Series
Return the last n rows.
This function returns last n rows from the object based on
position. It is useful for quickly verifying data, for example,
after sorting or appending rows.
For negative values of n, this function returns all rows except
the first |n| rows, equivalent to df[|n|:].
If n is larger than the number of rows, this function returns all rows.
| Parameter | |
|---|---|
| Name | Description | 
| n | int, default 5. Number of rows to select. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame | The last n rows of the caller object. | 
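The three cases described above (positive `n`, negative `n`, and `n` larger than the number of rows) map directly onto Python slicing. The sketch below uses a plain list as a stand-in for the rows. The function is an illustration of the positional semantics, not the bigframes implementation.

```python
def tail(rows, n=5):
    """Return the last n rows, or all but the first |n| rows when n is negative."""
    if n == 0:
        return []            # rows[-0:] would return everything, so handle 0 explicitly
    if n > 0:
        return rows[-n:]     # oversized n simply returns all rows
    return rows[abs(n):]     # negative n: drop the first |n| rows


rows = list(range(10))
print(tail(rows, 3))      # [7, 8, 9]
print(tail(rows, -3))     # [3, 4, 5, 6, 7, 8, 9]
print(tail([1, 2], 5))    # [1, 2]
```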
to_csv
to_csv(
    path_or_buf=None, sep=",", *, header: bool = True, index: bool = True
) -> typing.Optional[str]
Write object to a comma-separated values (csv) file on Cloud Storage.
| Parameters | |
|---|---|
| Name | Description | 
| path_or_buf | str, path object, file-like object, or None, default None. String, path object (implementing os.PathLike[str]), or file-like object implementing a write() function. If None, the result is returned as a string. If a non-binary file object is passed, it should be opened with  | 
| index | bool, default True. If True, write row names (index). | 
| Returns | |
|---|---|
| Type | Description | 
| None or str | If path_or_buf is None, returns the resulting CSV format as a string. Otherwise returns None. | 
to_dict
to_dict(into: type[dict] = <class 'dict'>) -> typing.Mapping
Convert Series to {label -> value} dict or dict-like object.
Examples:
>>> import bigframes.pandas as bpd
>>> from collections import OrderedDict, defaultdict
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 2, 3, 4])
>>> s.to_dict()
{np.int64(0): 1, np.int64(1): 2, np.int64(2): 3, np.int64(3): 4}
>>> s.to_dict(into=OrderedDict)
OrderedDict({np.int64(0): 1, np.int64(1): 2, np.int64(2): 3, np.int64(3): 4})
>>> dd = defaultdict(list)
>>> s.to_dict(into=dd)
defaultdict(<class 'list'>, {np.int64(0): 1, np.int64(1): 2, np.int64(2): 3, np.int64(3): 4})
| Parameter | |
|---|---|
| Name | Description | 
| into | class, default dict. The collections.abc.Mapping subclass to use as the return object. Can be the actual class or an empty instance of the mapping type you want. If you want a collections.defaultdict, you must pass it initialized. | 
| Returns | |
|---|---|
| Type | Description | 
| collections.abc.Mapping | Key-value representation of Series. | 
to_excel
to_excel(excel_writer, sheet_name="Sheet1", **kwargs) -> None
Write Series to an Excel sheet.
To write a single Series to an Excel .xlsx file it is only necessary to
specify a target file name. To write to multiple sheets it is necessary to
create an ExcelWriter object with a target file name, and specify a sheet
in the file to write to.
Multiple sheets may be written to by specifying unique sheet_name.
With all data written to the file it is necessary to save the changes.
Note that creating an ExcelWriter object with a file name that already
exists will result in the contents of the existing file being erased.
| Parameters | |
|---|---|
| Name | Description | 
| excel_writer | path-like, file-like, or ExcelWriter object. File path or existing ExcelWriter. | 
| sheet_name | str, default 'Sheet1'. Name of sheet to contain Series. | 
to_frame
to_frame(name: typing.Hashable = None) -> bigframes.dataframe.DataFrame
Convert Series to DataFrame.
The column in the new dataframe will be named name (the keyword parameter) if the name parameter is provided and not None.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series(["a", "b", "c"],
...                name="vals")
>>> s.to_frame()
  vals
0    a
1    b
2    c
<BLANKLINE>
[3 rows x 1 columns]
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame | DataFrame representation of Series. | 
to_json
to_json(
    path_or_buf=None,
    orient: typing.Optional[
        typing.Literal["split", "records", "index", "columns", "values", "table"]
    ] = None,
    *,
    lines: bool = False,
    index: bool = True
) -> typing.Optional[str]
Convert the object to a JSON string, written to Cloud Storage.
Note that NaN and None values are converted to null, and datetime objects are converted to UNIX timestamps.
| Parameters | |
|---|---|
| Name | Description | 
| path_or_buf | str, path object, file-like object, or None, default None. String, path object (implementing os.PathLike[str]), or file-like object implementing a write() function. If None, the result is returned as a string. Can be a destination URI of Cloud Storage file(s) to store the extracted dataframe in format of  | 
| orient | Indication of expected JSON string format. * Series: default is 'index'; allowed values are {'split', 'records', 'index', 'table'}. * DataFrame: default is 'columns'; allowed values are {'split', 'records', 'index', 'columns', 'values', 'table'}. * The format of the JSON string: - 'split' : dict like {'index' -> [index], 'columns' -> [columns], 'data' -> [values]} - 'records' : list like [{column -> value}, ... , {column -> value}] - 'index' : dict like {index -> {column -> value}} - 'columns' : dict like {column -> {index -> value}} - 'values' : just the values array - 'table' : dict like {'schema': {schema}, 'data': {data}} Describing the data, where data component is like  | 
| index | bool, default True. If True, write row names (index). | 
| lines | bool, default False. If 'orient' is 'records', write out line-delimited JSON format. Raises ValueError for any other 'orient', since the other formats are not list-like. | 
| Exceptions | |
|---|---|
| Type | Description | 
| ValueError | If lines is True but records is not provided as value for orient. | 
| Returns | |
|---|---|
| Type | Description | 
| None or str | If path_or_buf is None, returns the resulting json format as a string. Otherwise returns None. | 
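The `orient` shapes listed in the parameter table are easier to compare side by side. The sketch below builds each shape for a tiny two-row table using only the standard `json` module. It is a hand-written illustration of the documented layouts, not the bigframes serializer: the function `to_json_sketch` and the sample table are hypothetical, and the 'table' orient is left out.

```python
import json

index = ["r1", "r2"]
columns = ["a", "b"]
data = [[1, 2], [3, 4]]


def to_json_sketch(orient, lines=False):
    """Serialize the sample table in one of the documented orient layouts."""
    if lines and orient != "records":
        raise ValueError("lines=True requires orient='records'")
    if orient == "split":
        obj = {"index": index, "columns": columns, "data": data}
    elif orient == "records":
        obj = [dict(zip(columns, row)) for row in data]
    elif orient == "index":
        obj = {idx: dict(zip(columns, row)) for idx, row in zip(index, data)}
    elif orient == "columns":
        obj = {
            col: {idx: row[k] for idx, row in zip(index, data)}
            for k, col in enumerate(columns)
        }
    elif orient == "values":
        obj = data
    else:
        raise ValueError(f"unsupported orient: {orient!r}")
    if lines:
        # Line-delimited JSON: one record per line.
        return "\n".join(json.dumps(record) for record in obj)
    return json.dumps(obj)


for name in ("split", "records", "index", "columns", "values"):
    print(name, to_json_sketch(name))
print(to_json_sketch("records", lines=True))
```

Only 'records' produces a list of independent objects, which is why it is the only layout that can be written one record per line.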
to_latex
to_latex(
    buf=None, columns=None, header=True, index=True, **kwargs
) -> typing.Optional[str]
Render object to a LaTeX tabular, longtable, or nested table.
| Parameters | |
|---|---|
| Name | Description | 
| buf | str, Path or StringIO-like, optional, default None. Buffer to write to. If None, the output is returned as a string. | 
| columns | list of label, optional. The subset of columns to write. Writes all columns by default. | 
| header | bool or list of str, default True. Write out the column names. If a list of strings is given, it is assumed to be aliases for the column names. | 
| index | bool, default True. Write row names (index). | 
| Returns | |
|---|---|
| Type | Description | 
| str or None | If buf is None, returns the result as a string. Otherwise returns None. | 
to_list
to_list() -> list
Return a list of the values.
These are each a scalar type, which is a Python scalar (for str, int, float) or a pandas scalar (for Timestamp/Timedelta/Interval/Period).
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 2, 3])
>>> s
0    1
1    2
2    3
dtype: Int64
>>> s.to_list()
[1, 2, 3]
| Returns | |
|---|---|
| Type | Description | 
| list | list of the values. | 
to_markdown
to_markdown(
    buf: typing.Optional[typing.IO[str]] = None,
    mode: str = "wt",
    index: bool = True,
    **kwargs
) -> typing.Optional[str]
Print Series in Markdown-friendly format.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series(["elk", "pig", "dog", "quetzal"], name="animal")
>>> print(s.to_markdown())
|    | animal   |
|---:|:---------|
|  0 | elk      |
|  1 | pig      |
|  2 | dog      |
|  3 | quetzal  |
Output markdown with a tabulate option.
>>> print(s.to_markdown(tablefmt="grid"))
+----+----------+
|    | animal   |
+====+==========+
|  0 | elk      |
+----+----------+
|  1 | pig      |
+----+----------+
|  2 | dog      |
+----+----------+
|  3 | quetzal  |
+----+----------+
| Parameters | |
|---|---|
| Name | Description | 
| buf | str, Path or StringIO-like, optional, default None. Buffer to write to. If None, the output is returned as a string. | 
| mode | str, optional. Mode in which file is opened, "wt" by default. | 
| index | bool, optional, default True. Add index (row) labels. | 
| Returns | |
|---|---|
| Type | Description | 
| str | Series in Markdown-friendly format. | 
to_numpy
to_numpy(dtype=None, copy=False, na_value=None, **kwargs) -> numpy.ndarray
A NumPy ndarray representing the values in this Series or Index.
Examples:
>>> import bigframes.pandas as bpd
>>> import pandas as pd
>>> bpd.options.display.progress_bar = None
>>> ser = bpd.Series(pd.Categorical(['a', 'b', 'a']))
>>> ser.to_numpy()
array(['a', 'b', 'a'], dtype=object)
Specify the dtype to control how datetime-aware data is represented. Use dtype=object to return an ndarray of pandas Timestamp objects, each with the correct tz.
>>> ser = bpd.Series(pd.date_range('2000', periods=2, tz="CET"))
>>> ser.to_numpy(dtype=object)
array([Timestamp('1999-12-31 23:00:00+0000', tz='UTC'),
       Timestamp('2000-01-01 23:00:00+0000', tz='UTC')], dtype=object)
Or dtype=datetime64[ns] to return an ndarray of native datetime64 values.
The values are converted to UTC and the timezone info is dropped.
>>> ser.to_numpy(dtype="datetime64[ns]")
array(['1999-12-31T23:00:00.000000000', '2000-01-01T23:00:00.000000000'],
      dtype='datetime64[ns]')
| Parameters | |
|---|---|
| Name | Description | 
| dtype | str or numpy.dtype, optional. The dtype to pass to  | 
| copy | bool, default False. Whether to ensure that the returned value is not a view on another array. Note that  | 
| na_value | Any, optional. The value to use for missing values. The default value depends on  | 
| Returns | |
|---|---|
| Type | Description | 
| numpy.ndarray | A NumPy ndarray representing the values in this Series or Index. | 
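The timezone examples above show midnight in CET on 2000-01-01 appearing as 23:00 on the previous day once converted to UTC. The standard-library sketch below reproduces that conversion for a single timestamp. It is only an illustration of the arithmetic: it models CET as a fixed +01:00 offset (an assumption that ignores summer time) and does not call bigframes or NumPy.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed +01:00 offset stands in for CET in winter.
cet = timezone(timedelta(hours=1), "CET")
local_midnight = datetime(2000, 1, 1, tzinfo=cet)

as_utc = local_midnight.astimezone(timezone.utc)  # same instant, expressed in UTC
naive_utc = as_utc.replace(tzinfo=None)           # timezone info dropped

print(as_utc.isoformat())     # 1999-12-31T23:00:00+00:00
print(naive_utc.isoformat())  # 1999-12-31T23:00:00
```

The first value corresponds to the timezone-aware object representation, and the second to the native datetime64 representation with the timezone dropped.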
to_pandas
to_pandas(
    max_download_size: typing.Optional[int] = None,
    sampling_method: typing.Optional[str] = None,
    random_state: typing.Optional[int] = None,
    *,
    ordered: bool = True
) -> pandas.core.series.Series
Writes Series to pandas Series.
| Parameters | |
|---|---|
| Name | Description | 
| max_download_size | int, default None. Download size threshold in MB. If max_download_size is exceeded when downloading data (e.g., to_pandas()), the data will be downsampled if bigframes.options.sampling.enable_downsampling is True, otherwise, an error will be raised. If set to a value other than None, this will supersede the global config. | 
| sampling_method | str, default None. Downsampling algorithms to be chosen from, the choices are: "head": This algorithm returns a portion of the data from the beginning. It is fast and requires minimal computations to perform the downsampling; "uniform": This algorithm returns uniform random samples of the data. If set to a value other than None, this will supersede the global config. | 
| random_state | int, default None. The seed for the uniform downsampling algorithm. If provided, the uniform method may take longer to execute and require more computation. If set to a value other than None, this will supersede the global config. | 
| ordered | bool, default True. Determines whether the resulting pandas series will be ordered. In some cases, unordered may result in a faster-executing query. | 
| Returns | |
|---|---|
| Type | Description | 
| pandas.Series | A pandas Series with all rows of this Series if the data_sampling_threshold_mb is not exceeded; otherwise, a pandas Series with downsampled rows of the Series. | 
to_pickle
to_pickle(path, **kwargs) -> None
Pickle (serialize) object to file.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> original_df = bpd.DataFrame({"foo": range(5), "bar": range(5, 10)})
>>> original_df
   foo  bar
0    0    5
1    1    6
2    2    7
3    3    8
4    4    9
<BLANKLINE>
[5 rows x 2 columns]
>>> original_df.to_pickle("./dummy.pkl")
>>> unpickled_df = bpd.read_pickle("./dummy.pkl")
>>> unpickled_df
   foo  bar
0    0    5
1    1    6
2    2    7
3    3    8
4    4    9
<BLANKLINE>
[5 rows x 2 columns]
| Parameter | |
|---|---|
| Name | Description | 
| path | str, path object, or file-like object. String, path object (implementing  | 
to_string
to_string(
    buf=None,
    na_rep="NaN",
    float_format=None,
    header=True,
    index=True,
    length=False,
    dtype=False,
    name=False,
    max_rows=None,
    min_rows=None,
) -> typing.Optional[str]
Render a string representation of the Series.
| Parameters | |
|---|---|
| Name | Description | 
| buf | StringIO-like, optional. Buffer to write to. | 
| na_rep | str, optional. String representation of NaN to use, default 'NaN'. | 
| float_format | one-parameter function, optional. Formatter function to apply to columns' elements if they are floats, default None. | 
| header | bool, default True. Add the Series header (index name). | 
| index | bool, optional. Add index (row) labels, default True. | 
| length | bool, default False. Add the Series length. | 
| dtype | bool, default False. Add the Series dtype. | 
| name | bool, default False. Add the Series name if not None. | 
| max_rows | int, optional. Maximum number of rows to show before truncating. If None, show all. | 
| min_rows | int, optional. The number of rows to display in a truncated repr (when number of rows is above  | 
| Returns | |
|---|---|
| Type | Description | 
| str or None | String representation of Series if buf=None, otherwise None. | 
to_xarray
to_xarray()
Return an xarray object from the pandas object.
| Returns | |
|---|---|
| Type | Description | 
| xarray.DataArray or xarray.Dataset | Data in the pandas structure converted to Dataset if the object is a DataFrame, or a DataArray if the object is a Series. | 
tolist
tolist() -> list
Return a list of the values.
These are each a scalar type, which is a Python scalar (for str, int, float) or a pandas scalar (for Timestamp/Timedelta/Interval/Period).
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 2, 3])
>>> s
0    1
1    2
2    3
dtype: Int64
>>> s.tolist()
[1, 2, 3]
| Returns | |
|---|---|
| Type | Description | 
| list | list of the values. | 
transpose
transpose() -> bigframes.series.Series
Return the transpose, which is by definition self.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series(['Ant', 'Bear', 'Cow'])
>>> s
0     Ant
1    Bear
2     Cow
dtype: string
>>> s.transpose()
0     Ant
1    Bear
2     Cow
dtype: string
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Series. | 
truediv
truediv(other: float | int | bigframes.series.Series) -> bigframes.series.Series
Return floating division of Series and other, element-wise (binary operator truediv).
Equivalent to series / other, but with support to substitute a
fill_value for missing data in either one of the inputs.
Examples:
>>> import bigframes.pandas as bpd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> a = bpd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a     1.0
b     1.0
c     1.0
d    <NA>
dtype: Float64
>>> b = bpd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a     1.0
b    <NA>
d     1.0
e    <NA>
dtype: Float64
>>> a.divide(b)
a     1.0
b    <NA>
c    <NA>
d    <NA>
e    <NA>
dtype: Float64
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The result of the operation. | 
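The index alignment visible in the example above can be sketched in plain Python. This is only an illustration of the semantics, not the bigframes implementation: `aligned_truediv` is a hypothetical helper, dictionaries stand in for labeled Series, and `None` stands in for `<NA>`.

```python
def aligned_truediv(left, right):
    """Divide two label -> value mappings over the union of their labels.

    A label that is missing from either side, or whose value is None on
    either side, produces None in the result.
    """
    result = {}
    for label in sorted(set(left) | set(right)):
        x = left.get(label)
        y = right.get(label)
        result[label] = None if x is None or y is None else x / y
    return result


a = {"a": 1.0, "b": 1.0, "c": 1.0, "d": None}
b = {"a": 1.0, "b": None, "d": 1.0, "e": None}
quotient = aligned_truediv(a, b)
print(quotient)
```

Only label `a` has a non-missing value on both sides, so it is the only label with a numeric result.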
unique
unique(keep_order=True) -> bigframes.series.Series
Return unique values of Series object.
By default, unique values are returned in order of appearance. The operation is hash table-based and therefore does NOT sort.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([2, 1, 3, 3], name='A')
>>> s
0    2
1    1
2    3
3    3
Name: A, dtype: Int64
Example with order preservation: Slower, but keeps order
>>> s.unique()
0    2
1    1
2    3
Name: A, dtype: Int64
Example without order preservation: Faster, but loses original order
>>> s.unique(keep_order=False)
0    1
1    2
2    3
Name: A, dtype: Int64
| Parameter | |
|---|---|
| Name | Description | 
| keep_order | bool, default True. If True, preserves the order of the first appearance of each unique value. If False, returns the elements in ascending order, which can be faster. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | The unique values returned as a Series. | 
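The difference between the two `keep_order` modes can be sketched in plain Python. This is a minimal illustration of the documented behavior, not how bigframes computes the result; `unique_values` is a hypothetical helper operating on a list.

```python
def unique_values(values, keep_order=True):
    """Return distinct values from a list.

    keep_order=True keeps each value at the position of its first appearance.
    keep_order=False returns the distinct values in ascending order.
    """
    if keep_order:
        # dict preserves insertion order, so duplicates collapse in place.
        return list(dict.fromkeys(values))
    return sorted(set(values))


data = [2, 1, 3, 3]
ordered = unique_values(data)
ascending = unique_values(data, keep_order=False)
print(ordered, ascending)
```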
unstack
unstack(
    level: typing.Union[str, int, typing.Sequence[typing.Union[str, int]]] = -1
)
Unstack, also known as pivot, a Series with a MultiIndex to produce a DataFrame.
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.DataFrame | Unstacked Series. | 
update
update(
    other: typing.Union[bigframes.series.Series, typing.Sequence, typing.Mapping]
) -> None
Modify Series in place using values from passed Series.
Uses non-NA values from passed Series to make updates. Aligns on index.
Examples:
>>> import bigframes.pandas as bpd
>>> import pandas as pd
>>> import numpy as np
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([1, 2, 3])
>>> s.update(bpd.Series([4, 5, 6]))
>>> s
0    4
1    5
2    6
dtype: Int64
>>> s = bpd.Series(['a', 'b', 'c'])
>>> s.update(bpd.Series(['d', 'e'], index=[0, 2]))
>>> s
0    d
1    b
2    e
dtype: string
>>> s = bpd.Series([1, 2, 3])
>>> s.update(bpd.Series([4, 5, 6, 7, 8]))
>>> s
0    4
1    5
2    6
dtype: Int64
If `other` contains NaNs the corresponding values are not updated
in the original Series.
>>> s = bpd.Series([1, 2, 3])
>>> s.update(bpd.Series([4, np.nan, 6], dtype=pd.Int64Dtype()))
>>> s
0    4
1    2
2    6
dtype: Int64
`other` can also be a non-Series object
that is coercible into a Series.
>>> s = bpd.Series([1, 2, 3])
>>> s.update([4, np.nan, 6])
>>> s
0    4.0
1    2.0
2    6.0
dtype: Float64
>>> s = bpd.Series([1, 2, 3])
>>> s.update({1: 9})
>>> s
0    1
1    9
2    3
dtype: Int64
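The update rules shown in the examples above (align on index, skip NA values, ignore labels that are not in the original) can be summarized in a short plain-Python sketch. It is an illustration under stated assumptions, not the bigframes implementation: `update_in_place` is a hypothetical helper, dictionaries stand in for labeled Series, and `None` stands in for NA.

```python
def update_in_place(target, other):
    """Overwrite values in target with the non-missing values from other.

    Labels present only in other are ignored, and None values in other
    leave the corresponding entry of target unchanged.
    """
    for label, value in other.items():
        if label in target and value is not None:
            target[label] = value


letters = {0: "a", 1: "b", 2: "c"}
update_in_place(letters, {0: "d", 2: "e"})       # partial overlap

numbers = {0: 1, 1: 2, 2: 3}
update_in_place(numbers, {0: 4, 1: None, 2: 6})  # missing value is skipped

longer = {0: 1, 1: 2, 2: 3}
update_in_place(longer, {0: 4, 1: 5, 2: 6, 3: 7, 4: 8})  # extra labels ignored
```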
value_counts
value_counts(
    normalize: bool = False,
    sort: bool = True,
    ascending: bool = False,
    *,
    dropna: bool = True
)
Return a Series containing counts of unique values.
The resulting object will be in descending order so that the first element is the most frequently-occurring element. Excludes NA values by default.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([3, 1, 2, 3, 4, bpd.NA], dtype="Int64")
>>> s
0       3
1       1
2       2
3       3
4       4
5    <NA>
dtype: Int64
value_counts sorts the result by counts in a descending order by default:
>>> s.value_counts()
3      2
1      1
2      1
4      1
Name: count, dtype: Int64
You can normalize the counts to return relative frequencies by setting normalize=True:
>>> s.value_counts(normalize=True)
3    0.4
1    0.2
2    0.2
4    0.2
Name: proportion, dtype: Float64
You can get the values in the ascending order of the counts by setting ascending=True:
>>> s.value_counts(ascending=True)
1    1
2    1
4    1
3    2
Name: count, dtype: Int64
You can include the counts of the NA values by setting dropna=False:
>>> s.value_counts(dropna=False)
3       2
1       1
2       1
4       1
<NA>    1
Name: count, dtype: Int64
| Parameters | |
|---|---|
| Name | Description | 
| normalize | bool, default False. If True then the object returned will contain the relative frequencies of the unique values. | 
| sort | bool, default True. Sort by frequencies. | 
| ascending | bool, default False. Sort in ascending order. | 
| dropna | bool, default True. Don't include counts of NaN. | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Series containing counts of unique values. | 
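How the four parameters interact can be sketched with `collections.Counter` from the standard library. This is a rough illustration of the documented semantics rather than the bigframes implementation: `count_values` is a hypothetical helper, `None` stands in for NA, and the order among equal counts is a property of this sketch only.

```python
from collections import Counter


def count_values(values, normalize=False, sort=True, ascending=False, dropna=True):
    """Return (value, count) pairs, or (value, proportion) pairs if normalize=True."""
    kept = [v for v in values if not (dropna and v is None)]
    items = list(Counter(kept).items())
    if sort:
        # Stable sort by count; ties keep their order of first appearance.
        items.sort(key=lambda pair: pair[1], reverse=not ascending)
    if normalize:
        total = len(kept)
        items = [(value, count / total) for value, count in items]
    return items


data = [3, 1, 2, 3, 4, None]
default_counts = count_values(data)
proportions = count_values(data, normalize=True)
with_missing = count_values(data, dropna=False)
```

Note that with `normalize=True` the proportions are taken over the values that remain after NA handling, which is why the value 3 has a proportion of 0.4 rather than one third.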
var
var() -> float
Return unbiased variance over requested axis.
Normalized by N-1 by default.
| Returns | |
|---|---|
| Type | Description | 
| scalar or bigframes.pandas.Series (if level specified) | Variance. | 
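The N-1 normalization can be checked by hand. The following is a minimal plain-Python sketch of the unbiased sample variance, not the bigframes implementation; `sample_variance` is a hypothetical helper, and `statistics.variance` from the standard library is used only as a cross-check because it uses the same denominator.

```python
import statistics


def sample_variance(values):
    """Unbiased sample variance: sum of squared deviations divided by N - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


data = [1, 2, 3, 4]
manual = sample_variance(data)            # squared deviations sum to 5, N - 1 is 3
reference = statistics.variance(data)     # also normalizes by N - 1
print(manual, reference)
```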
where
where(cond, other=None)
Replace values where the condition is False.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> s = bpd.Series([10, 11, 12, 13, 14])
>>> s
0    10
1    11
2    12
3    13
4    14
dtype: Int64
You can filter the values in the Series based on a condition. Values that
match the condition are kept, and values that do not match are replaced.
The default replacement value is NA.
>>> s.where(s % 2 == 0)
0      10
1    <NA>
2      12
3    <NA>
4      14
dtype: Int64
You can specify a custom replacement value for non-matching values.
>>> s.where(s % 2 == 0, -1)
0    10
1    -1
2    12
3    -1
4    14
dtype: Int64
>>> s.where(s % 2 == 0, 100*s)
0      10
1    1100
2      12
3    1300
4      14
dtype: Int64
| Parameters | |
|---|---|
| Name | Description | 
| cond | bool Series/DataFrame, array-like, or callable. Where cond is True, keep the original value. Where False, replace with corresponding value from other. If cond is callable, it is computed on the Series/DataFrame and returns boolean Series/DataFrame or array. The callable must not change input Series/DataFrame (though pandas doesn’t check it). | 
| other | scalar, Series/DataFrame, or callable. Entries where cond is False are replaced with corresponding value from other. If other is callable, it is computed on the Series/DataFrame and returns scalar or Series/DataFrame. The callable must not change input Series/DataFrame (though pandas doesn’t check it). If not specified, entries will be filled with the corresponding NULL value (np.nan for numpy dtypes, pd.NA for extension dtypes). | 
| Returns | |
|---|---|
| Type | Description | 
| bigframes.pandas.Series | Series after the replacement. |
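The keep-or-replace rule described for `cond` and `other` can be sketched element by element in plain Python. This is an illustration only, not the bigframes implementation: `where_sketch` is a hypothetical helper working on lists, `None` stands in for NA, and callable arguments are not covered.

```python
def where_sketch(values, cond, other=None):
    """Keep values[i] where cond[i] is True; otherwise use the replacement.

    other may be a scalar (applied everywhere) or a list of the same
    length as values (applied position by position).
    """
    result = []
    for i, (value, keep) in enumerate(zip(values, cond)):
        if keep:
            result.append(value)
        elif isinstance(other, list):
            result.append(other[i])
        else:
            result.append(other)
    return result


values = [10, 11, 12, 13, 14]
is_even = [v % 2 == 0 for v in values]
masked = where_sketch(values, is_even)
filled = where_sketch(values, is_even, -1)
scaled = where_sketch(values, is_even, [100 * v for v in values])
```

The three results mirror the three `s.where(...)` calls in the examples above: a default NA replacement, a scalar replacement, and an element-wise replacement.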