Pandas DataFrame.sample() | Python
Last Updated: 11 Apr, 2025

The Pandas DataFrame.sample() function selects rows or columns at random from a DataFrame. It is particularly helpful with large datasets, where we want to test or analyze a small representative subset. We can set the number or proportion of items to sample, and control the randomness, through parameters such as n, frac and random_state.

Example: Sampling a Single Random Row

In this example, we load a dataset and draw a single random row with the sample() method by setting n=1.

```python
import pandas as pd

# Load dataset
d = pd.read_csv("employees.csv")

# Sample one random row
r_row = d.sample(n=1)

# Display the result
print(r_row)
```

Output: one row of the DataFrame (image not reproduced).

Calling sample(n=1) selects one random row from the DataFrame.

Syntax

DataFrame.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None)

Parameters:

n: int, optional. Number of items to return. Defaults to 1 when frac is not given.

frac: float, optional. Fraction of items to return (frac * length of the DataFrame). frac cannot be used together with n.

replace: bool. If True, sample with replacement, so the same row can be drawn more than once.

weights: str or array-like, optional. Sampling weights. With the default None, every item is equally likely to be drawn.

random_state: int or numpy.random.RandomState, optional. If set to a particular integer, sample() returns the same rows on every run.

axis: 0 or 'index' for rows, 1 or 'columns' for columns.

Return Type: A new object of the same type as the caller.

The examples below read a CSV file named employees.csv.

Examples of Pandas DataFrame.sample()

Example 1: Sample 25% of the DataFrame

In this example, we generate a random sample consisting of 25% of the entire DataFrame by using the frac parameter.
```python
import pandas as pd

d = pd.read_csv("employees.csv")

# Sample 25% of the data
sr = d.sample(frac=0.25)

# Verify the number of rows
print(f"Original rows: {len(d)}")
print(f"Sampled rows (25%): {len(sr)}")

# Display the result
print(sr)
```

Output: 25% of the DataFrame (image not reproduced).

The sample holds roughly 25% of the DataFrame's rows, and those rows are chosen at random.

Example 2: Sampling with Replacement and a Fixed Random State

This example samples multiple rows with replacement (that is, the same row may be repeated) and makes the result reproducible by fixing the random seed.

```python
import pandas as pd

d = pd.read_csv("employees.csv")

# Sample 3 rows with replacement and a fixed seed
sd = d.sample(n=3, replace=True, random_state=42)

# Display the result
print(sd)
```

Output: sampling with replacement (image not reproduced).

The replace=True parameter allows the same row to be sampled more than once, which makes it well suited to bootstrapping. random_state=42 makes the result reproducible across multiple runs, which is very useful during testing and debugging.

Author: Kartikaybhutani
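The behaviour described in the examples above can be checked without the CSV file. The sketch below is only an illustration: it builds a small in-memory DataFrame (the column names and values are made up, not taken from employees.csv) and confirms that a fixed random_state yields the same sample twice and that frac controls the number of rows returned.

```python
import pandas as pd

# Hypothetical stand-in for employees.csv; names and values are illustrative only.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dev", "Eli", "Fay", "Gus", "Hana"],
    "salary": [50, 60, 70, 80, 90, 100, 110, 120],
})

# The same seed gives the same rows on every call.
first = df.sample(n=3, replace=True, random_state=42)
second = df.sample(n=3, replace=True, random_state=42)
print(first.equals(second))  # True

# frac=0.25 of 8 rows gives 2 rows.
quarter = df.sample(frac=0.25, random_state=0)
print(len(quarter))  # 2
```

Leaving out random_state would make each call draw a fresh sample, so the equality check would usually fail.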
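The introduction notes that sample() can pick columns as well as rows, and the syntax lists a weights parameter, but neither appears in the examples above. The following is a minimal sketch of both, again using a made-up in-memory DataFrame rather than employees.csv; the weight values are arbitrary assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical data; column names and values are illustrative only.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dev"],
    "team": ["A", "A", "B", "B"],
    "salary": [50, 60, 70, 80],
})

# axis=1 samples columns instead of rows: all 4 rows, 2 random columns.
two_cols = df.sample(n=2, axis=1, random_state=1)
print(two_cols.shape)  # (4, 2)

# weights gives one relative weight per row; a weight of 0 means
# that row (index 0 here) can never be drawn.
weighted = df.sample(n=2, weights=[0, 1, 1, 2], random_state=1)
print(0 in weighted.index)  # False
```

The weights do not need to sum to 1; pandas normalizes them before sampling.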