See, a few applications resided in Sybase. Sybase was the outdated one, so we were involved in migrating the Sybase applications to Oracle. Oracle is more optimized than Sybase, so we had to create the objects from the table level onwards, from the columns onwards, and whatever procedures were present in Sybase we had to migrate into Oracle using some optimization techniques. Yes, correct. It took around one and a half to two years. See, so many applications resided in Sybase, and we had to analyze each one, implement it in Oracle, take it through development and then production, and then work on the defects. That definitely takes a long time. It's not just a database upgrade; the applications themselves had to be migrated from Sybase to Oracle and tested with the Java environment as well. It was application migration. The front end was Java. There were around 20 applications: their data was in Sybase and their front end was in Java, so we needed to migrate from one system to another and integrate Oracle with the Java side. The thing is, we had to analyze what needed to be implemented, then do development, then production, then defects. That is why it took that long; I'm not sure why you are asking why.
Since I started my career very late, after a four-year gap, I joined as a contract employee. I had only SQL and PL/SQL, and they gave me training on cloud computing, so I came from a SQL background into Azure: Azure Blob Storage, Azure Data Lake Gen2, Azure Data Factory, Azure Synapse (the data warehouse), Logic Apps, which we have used for triggering mail alerts, Databricks, which we use at a basic level for validation purposes, and Azure SQL Database. They only recently started giving the training on it, so we just have the subscription and have started using it for validation: before running the pipeline we need to check whether the source has made any changes to the source file, so we load that file into a DataFrame and check it there. I can say we are still at a basic level, still in the learning phase.
Okay, in my project we are using Visual Studio 2022. First we have to capture the changes from remote to local, and then we have to create a development feature branch. Under the Git changes we go to the Commit All option with a valid comment, and if you come down you will see a pull request option, where you pull the changes from our feature branch to the development branch; it basically validates whether the same changes have been made by any other team member or not.
Once we are done with that, we go to the Azure DevOps portal, and under Repositories you can check the pull request on the development branch. Once you create the pull request from the development branch to the staging or master branch, the reviewer reviews the changes and sends them to the master branch, and that creates a continuous integration pipeline and a release pipeline, which produces the artifacts, like a DACPAC file at the end, which has the summary of the changes, the code and the infrastructure.
We do have a Scala script handy, with a config variable and a mount variable, so we just keep reusing the same thing by changing the data store name and the key values, using access keys. We can also do it using a shared access signature. Then we can read the file inside PySpark using df = spark.read.csv() followed by the path of the file. Coming to Databricks, I only know it at a basic level, not too much.
In the case of Azure Data Factory, the flatten transformation is used when you have a column that consists of multiple values and you want separate records in the target; then we can go with the flatten transformation.
Delta Lake actually has the features of a data warehouse: it can provide schema versioning and schema enforcement, versioning, and ACID properties on the transactions. Whereas a data lake holds data of any type or any size and also allows us to do some analytics, so it is like a two-in-one option, storing and analyzing. Data lakes store structured, semi-structured and unstructured data; the data will be chunked if it is greater than two GB and replicated in three different regions, so we can process the data in parallel. I can say Delta Lake can also serve as a staging area.
So, if it is customized mapping we have to customize it, but you said we have multiple files, so we cannot go and do manual intervention. Otherwise, if you don't have any tables on the target side, you can select the Create table option and it creates the table on the target itself. I'm not able to recollect it exactly, but yes, we can do it using mapping data flows as well, using the Select transformation: all the columns present in the source file will be fetched, and then we can move them to the target. The thing is, first we need to access the source file inside the mapping data flow, and there we keep the Select transformation, which copies all the column names from the source file. There we can add a rule by clicking Add mapping and selecting rule-based mapping, where we have to give two inputs: the condition on which to match and the name of the mapped column. Both of those values we have to give.
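The answers above mention a reusable Scala mount script, reading files with spark.read.csv, and a Databricks check on the source file before running the pipeline. Below is a minimal PySpark sketch of that pattern under stated assumptions: the storage account, container, secret scope, file path and expected column list are illustrative placeholders, and an active Databricks session (with spark and dbutils available) is assumed.

```python
# Hedged PySpark sketch of the mount-and-read pattern and the pre-pipeline validation
# check described above. All names below are illustrative placeholders.
storage_account = "mystorageacct"          # hypothetical data store name
container = "raw"                          # hypothetical container
mount_point = f"/mnt/{container}"

# Fetch the access key from a Databricks secret scope instead of hard-coding it.
access_key = dbutils.secrets.get(scope="kv-scope", key="storage-access-key")

# Mount the container once; the same snippet is reused by changing the data store
# name and key values (a shared access signature could be configured instead).
if not any(m.mountPoint == mount_point for m in dbutils.fs.mounts()):
    dbutils.fs.mount(
        source=f"wasbs://{container}@{storage_account}.blob.core.windows.net",
        mount_point=mount_point,
        extra_configs={
            f"fs.azure.account.key.{storage_account}.blob.core.windows.net": access_key
        },
    )

# Read the source file into a DataFrame, as in df = spark.read.csv(<path>).
df = spark.read.csv(f"{mount_point}/sales/latest.csv", header=True, inferSchema=True)

# Basic validation before running the pipeline: check whether the source has
# changed the file layout by comparing the columns against an expected list.
expected_columns = ["customer_id", "order_date", "amount"]   # hypothetical schema
missing = [c for c in expected_columns if c not in df.columns]
extra = [c for c in df.columns if c not in expected_columns]
if missing or extra:
    raise ValueError(f"Source file layout changed. Missing: {missing}, unexpected: {extra}")
```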
Yeah, Azure Data Lake, see, we can dump any type of data into it, irrespective of the size of the data: it will store structured, semi-structured and unstructured data, and we don't need to bother about the data size or about cleaning the data and keeping only the necessary data. The data will be divided and stored in chunks if it is larger than two GB. Whereas a data warehouse holds the data in a structured format, like tables with rows and columns. Data lakes support some analytical tools as well, so it provides a storage-and-analysis, two-in-one option: using Azure Data Lake Analytics, and also machine learning or R or Scala with Spark, we can analyze the data in the data lake.
If we are using Azure Data Factory to move the data from on-premises to SQL Server, first we need to create a linked service with the source as the on-premises server. We have to download a self-hosted integration runtime onto the on-premises machine, execute the .exe file, and copy-paste the same key while creating the linked service. From then on, that particular virtual machine will act as the self-hosted integration runtime for our pipeline. Basically, the integration runtime provides the computational power to create and maintain that infrastructure and manage it. Then we have to create a Get Metadata activity, which will fetch all the files from that particular folder, and these files can be passed through the ForEach activity by creating a parameter and adding the dynamic content @activity('Get Metadata').output.childItems, where the child items refer to the file names. Inside the ForEach we keep a Copy Data activity, which captures the file name from that ForEach through a parameter like item().name, so it fetches the file name and copies the file to the destination. If you are configuring the target as Blob Storage, then we have to create a linked service for the Blob Storage with an access key as the authentication method, the integration runtime would be the Azure default integration runtime, and the dataset would be in CSV format, so we pick the delimited text format while creating the dataset.
Using the GROUP BY function with HAVING COUNT of customer ID greater than one, we can achieve this. If the table is too huge, then we can go with the ROW_NUMBER function, which is a ranking function. Yes, ranking functions.
See, actually, while creating a linked service, if you write the credentials in there directly, it can definitely lead to a data breach. So in order to avoid that, we store the credentials inside Key Vault and fetch them dynamically. First we need to create a linked service to the Key Vault, and then we can fetch the values. Key Vault stores the keys in encrypted, cryptographic form for security purposes. First we need to go to the Key Vault and import, or upload manually, the credentials.
We are applying the transformations using mapping data flows: the Select transformation, derived column transformations, lookup transformations, group by and aggregate functions, and joins with cross-reference tables, and also abstraction and encryption in the case of very sensitive data. We do have a log table: at the end of the pipeline, the data will be recorded into the log table, and we will be fetching the results from that log table.
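A minimal sketch of the duplicate check described above, expressed here through Spark SQL; the customers table and customer_id column are assumed names, not details from the interview.

```python
# Hedged sketch of the duplicate-detection approaches mentioned above.
# Assumes an active SparkSession ("spark") and a registered table or view named
# "customers" with a "customer_id" column; both names are illustrative.

# GROUP BY ... HAVING COUNT(*) > 1: list the customer IDs that appear more than once.
duplicates = spark.sql("""
    SELECT customer_id, COUNT(*) AS occurrences
    FROM customers
    GROUP BY customer_id
    HAVING COUNT(*) > 1
""")

# ROW_NUMBER() ranking-function variant for large tables: number the rows within each
# customer_id and keep only the first, treating every row with rn > 1 as a duplicate.
deduplicated = spark.sql("""
    SELECT *
    FROM (
        SELECT c.*,
               ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY customer_id) AS rn
        FROM customers c
    ) ranked
    WHERE rn = 1
""")
```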
We use triggers: the normal schedule trigger, the tumbling window trigger and event-based triggers. If we want to run the pipeline at periodic intervals of time, we can go with the tumbling window trigger, and we can also set a dependency, like if my pipeline is dependent on another pipeline's data, and we can also fetch data for past dates with it. Yes, we do have some dependencies, and a few jobs are triggered at periodic intervals of time. Yes, we are getting some transactional files, so we run the pipeline every four hours, four times a day, and we are just using this tumbling window trigger for that: as soon as it fires, the pipeline is executed, and we have the buffer time within that window only.
Yes, event-based triggers are used to trigger the pipeline based on the arrival or the deletion of a file, but we are going with the arrival of the file. Suppose we are maintaining sales data: as soon as the sales files arrive in the Blob Storage, our trigger initiates the pipeline.
A linked service is the connection string to any of the data stores. While establishing the connection between any data store and Azure Data Factory, we need to fill in all the required fields for the linked service, like authentication, authorization, subscription, data store name and integration runtime. Whereas a dataset is created for the source and the destination to show the exact location of the data which we are going to access with Azure Data Factory.
Going with the Get Metadata activity and passing that particular input folder, it will fetch all the files from that folder, but you can filter them using the last modified date under the settings of the source. So if you want to fetch only today's files, then we can go with that, and if you have a wildcard pattern to match the file names, you can give the wildcard pattern right there.
Yes, we can give dynamic content under the file name to capture that, by giving a concatenation expression on that particular file name and appending the date of that particular day. Under the file name we give the dynamic content and build it there. If you are getting multiple files from the Get Metadata activity, they can be passed through the ForEach, where we give the dynamic content @activity('Get Metadata').output.childItems; here the child items mean the file names, but you fetch only the files matching the wildcard pattern, so inside the wildcard pattern you can give something like *_date. Using derived column transformations we can do it as well.
The tumbling window trigger is used to invoke the pipeline at frequent intervals of time. Suppose I want to trigger the pipeline in the 7 to 9 PM window, then we can go with this: it will split the run into multiple windows of the same pipeline, like 7 to 8 and 8 to 9 PM. We can also use properties like the window start time and the window end time, and we can fetch data for past dates as well. And we can set a dependency: if my pipeline is dependent on any other pipeline, then I can set the dependency so that until and unless that pipeline has been triggered, mine will not run.
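A rough sketch of what such a tumbling window trigger with a dependency could look like, written as a Python dict mirroring the trigger JSON; the trigger, pipeline and upstream names, the four-hour cadence and the exact field layout are assumptions from memory rather than details from the interview, and should be checked against the Azure Data Factory documentation.

```python
# Rough, hedged sketch of a tumbling window trigger with a dependency, written as a
# Python dict that mirrors the trigger JSON. Every name and the exact field layout
# here are assumptions and should be verified against the ADF documentation.
sales_trigger = {
    "name": "SalesTumblingTrigger",                  # hypothetical trigger name
    "properties": {
        "type": "TumblingWindowTrigger",
        "typeProperties": {
            "frequency": "Hour",
            "interval": 4,                           # the every-four-hours cadence mentioned above
            "startTime": "2024-01-01T00:00:00Z",
            "dependsOn": [
                {
                    # Run a window only after the upstream trigger's window has completed.
                    "type": "TumblingWindowTriggerDependencyReference",
                    "referenceTrigger": {
                        "referenceName": "UpstreamTrigger",   # hypothetical upstream trigger
                        "type": "TriggerReference",
                    },
                }
            ],
        },
        "pipeline": {
            "pipelineReference": {
                "referenceName": "SalesPipeline",    # hypothetical pipeline name
                "type": "PipelineReference",
            },
            "parameters": {
                # Window boundaries exposed by the trigger, usable for loading past dates.
                "windowStart": "@trigger().outputs.windowStartTime",
                "windowEnd": "@trigger().outputs.windowEndTime",
            },
        },
    },
}
```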
I'm not sure; can you put the same question in a different way?
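Collected below, as a hedged sketch, are the Azure Data Factory dynamic-content expressions the earlier answers describe for the Get Metadata / ForEach / Copy Data pattern and for date-stamped file names, written out as plain Python strings; the activity name 'Get Metadata1', the 'sales_' prefix, the date format and the concrete wildcard value are illustrative assumptions.

```python
# Hedged collection of the ADF dynamic-content expressions described in the answers
# above, kept as plain strings. The activity name, prefix, date format and wildcard
# are illustrative assumptions, not values from the interview.

# "Items" field of the ForEach activity, fed from the Get Metadata activity's output:
foreach_items = "@activity('Get Metadata1').output.childItems"

# Inside the ForEach, the current file name passed to the Copy Data source dataset:
current_file_name = "@item().name"

# Wildcard file name on the source, matching the "*_<date>" style pattern mentioned above:
wildcard_pattern = "*_20240131.csv"

# Sink file name built by appending today's date to a fixed prefix (e.g. sales_20240131.csv):
sink_file_name = "@concat('sales_', formatDateTime(utcNow(), 'yyyyMMdd'), '.csv')"
```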