The Semi Colon Is Used at The End of Proc SQL Statement

The document covers SQL statements and functions used for data manipulation in SAS: 1) A semicolon ends every PROC SQL statement, including QUIT. PROC SQL is used to create tables, insert values, select data, and perform calculations. 2) Functions such as COUNT, MIN, and SUM aggregate values, and COALESCE replaces missing values in a dataset. 3) The WHERE, GROUP BY, HAVING, and ORDER BY clauses filter, group, and sort data. Operators such as LIKE, AND, and OR can be used in the WHERE clause. 4) Joins merge datasets based on common variables. The different join types - INNER

Uploaded by

Om Prakash
Copyright
© Attribution Non-Commercial (BY-NC)

*a semicolon ends every proc sql statement, including quit and the statement before quit;
*for a variable declared as char, the length of the variable must be given;
proc sql;
   create table emp (id num, name char(20), doj num informat date9. format ddmmyy10.);
quit;

*values can be fed into the table in 2 ways: set and values;
*example of the set option;
proc sql;
   insert into emp
      set id=101, name="Deepika", doj="01Jan2009"d
      set id=102, name="Smita", doj="05Mar2010"d;
quit;

*example of the values option;
proc sql;
   insert into emp
      values(101,"Deepika","01Jan2009"d)
      values(102,"Surya","02Feb2008"d);
quit;

*-----------------------------------------;
*data manipulation;
*proc sql (sequence of clauses to be followed)
   select   = picking variables
   from     = dataset name
   where    = condition on every observation
   group by = grouping variable
   having   = condition on groups
   order by = for sorting
 quit;

*an asterisk is used to pick all the variables from the dataset;
proc sql;
   select * from local.baseball;
quit;

*to select a few variables from the dataset;
proc sql;
   select model,type,country from local.cars;
quit;

*to pick the unique values of a variable;
proc sql;
   select distinct league
   from local.baseball;
quit;

*to select the unique rows of a dataset;
proc sql;
   select distinct * from local.baseball;
quit;

*to limit the number of obs read (inobs) and written (outobs);
proc sql inobs=4 outobs=10;
   select * from local.cars;
quit;

*to use a function on a dataset;
proc sql;
   select count(no_hits) as total_hits from local.baseball;
quit;

*to add this new variable in a new table;
proc sql;
   create table rajan as
   select *, min(no_hits) as total_hits from local.baseball;
quit;

*to perform a calculation based on two variables;
proc sql;
   select origin,dest,(capacity-deplaned) as diff format 10.,capacity,deplaned
   from flights;
quit;

proc sql;
   select (no_hits/sum(no_hits)) as percent format 5.2 from local.baseball;
quit;

*to show it in terms of percentage;
proc sql;
   select (no_hits/sum(no_hits)) as percent format percent5.2 from local.baseball;
quit;

*having is used for conditions on calculated or aggregated values, e.g. sum, var, count. for order by, the calculated keyword is not required;
proc sql;
   select (no_hits/sum(no_hits))*100 as percent format 5.2
   from local.baseball
   having calculated percent>5;
quit;

*the logical operators and and or can be used in the where condition. calculated refers to a column computed in the select clause;
proc sql;
   select origin,dest,(capacity-deplaned) as diff format 10.,capacity,deplaned
   from flights
   where calculated diff>19;
quit;

libname files "C:\Documents and Settings\Hcl\Desktop\files\files";

*to replace missing values with a value;
proc sql;
   select name,coalesce(lowpoint,"NA") as lowpoint from files.continents;
quit;

*to replace a missing numeric value with a defined value using coalesce;
proc sql;
   select name,coalesce(area,12334) as area from files.continents;
quit;

proc sql;
   create table cont as
   select *, coalesce(area,12334) as area1 from files.continents;
quit;

*if and else if logic is written with case and when expressions;
*the comma after latitude indicates that another variable is being created;
*the dataset is not changed. a report is created directly;
proc sql;
   select city,latitude,
      case
         when latitude < -23 then "North Frigid"
         when latitude between -23 and 23 then "Temperate"
         when latitude > 23 then "South Frigid"
      end as climate_zone
   from files.worldcitycoords;
quit;

*group by;
proc sql;
   select category,sum(units) from local.candy_sales_summary group by category;
quit;

proc sql;
   select dest, sum(capacity) from flights group by dest;
quit;

proc sql;
   select dest,sum(deplaned) from flights group by dest having sum(deplaned)>230;
quit;

proc sql;
   select category,units
   from local.candy_sales_summary
   where units>2000
   order by category,units desc;
quit;

*joins/merge: the variable names can be different and sorting is not required. a select without create table produces a report and not a data set;
*only 4 types of joins;
*exact=inner join
 inner=full join
 left inner=left join
 right inner=right join;
data set1;
   input order_no order_amt deliverydate;
   informat deliverydate date9.;
   format deliverydate ddmmyy.;
   datalines;
101 2000 09jan2005
201 5000 10feb2006
304 6000 15mar2007
;
run;

data set2;
   input CN $ order_no Purchasequantity;
   datalines;
A1189 101 10000
B2453 451 1000
A3564 201 1500
;
run;

*example of right join;
proc sql;
   select y.order_no,order_amt,purchasequantity
   from set1 as x right join set2 as y
   on x.order_no=y.order_no;
quit;
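
The notes above list four join types but show only the right join. As a minimal sketch, assuming the set1 and set2 data sets created above are available in the WORK library, the other three could be written as follows. The coalesce on the key in the full join is an addition made here for illustration and is not part of the original notes.

```sas
proc sql;
   *inner join: keeps only the order_no values found in both tables (101 and 201);
   select x.order_no,order_amt,purchasequantity
   from set1 as x inner join set2 as y
   on x.order_no=y.order_no;

   *left join: keeps every row of set1, so purchasequantity is missing for 304;
   select x.order_no,order_amt,purchasequantity
   from set1 as x left join set2 as y
   on x.order_no=y.order_no;

   *full join: keeps the rows of both tables (101, 201, 304 and 451);
   *coalesce returns whichever key is not missing (an assumption added for this sketch);
   select coalesce(x.order_no,y.order_no) as order_no,order_amt,purchasequantity
   from set1 as x full join set2 as y
   on x.order_no=y.order_no;
quit;
```

To store a join result as a data set instead of printing a report, the select can be preceded by create table, in the same way as the create table rajan example earlier in the notes.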
