Insight Mind Sdn Bhd


Information Technology and Services

Bandar Puchong Jaya, Selangor · 1,195 followers

Empowering Actionable Insights

About us

Our Mission: To improve the quality of life of the communities we serve by leveraging the power of technology and data to maximize productivity.

Our core values:
  • Innovation – We demonstrate our strength in the technology we create for our clients, and we value innovative efforts, ideas, and methods to continually improve our business.
  • Self-Responsibility – We are responsible and accountable for our actions.
  • Integrity – We exhibit honesty, openness, fairness, and integrity at all times when dealing with our team, partners, and clients.
  • Growth – We celebrate failure as an opportunity for growth. We admit mistakes and learn from them.
  • Heart – We seek to have a positive impact by always putting our partners, clients, and team first.
  • Trustworthy – We believe that being trustworthy and showing integrity in our daily lives is the key to long-standing client relationships and loyalty within our team.

Website
https://www.insightmind.com.my
Industry
Information Technology and Services
Company size
2-10 employees
Headquarters
Bandar Puchong Jaya, Selangor
Type
Privately Held
Founded
2017
Specialties
Information and Communication Technology, Internet of Things, Data Science, Innovation, Design Thinking, Project Management, ICT Strategic Planning, ICT Training, Leadership, Agriculture Insights, HR Insights, Data Driven, Business Analytics

Locations

  • Primary

    SS-02-20, Skypod Square,

    Persiaran Puchong Jaya Selatan

    Bandar Puchong Jaya, Selangor 47100, MY


Employees at Insight Mind Sdn Bhd

Updates

  • Insight Mind Sdn Bhd reposted this

    View the profile of Dipankar Mazumdar, M.Sc 🥑

    Staff Data Engineer Advocate @Onehouse.ai | Apache Hudi, Iceberg Contributor | Author of "Engineering Lakehouses"

    Data Warehouse -> Data Lakes -> Data Lakehouse. Data architectures have evolved over time, depending on the 'type' of workloads they need to serve. Nowadays, more & more organizations have been thinking of and adopting open lakehouse architectures. The reason is actually quite simple! To start with:
    ✅ customers have the flexibility to store data in open storage formats (table + file)
    ✅ every component is modular, which means flexibility in terms of bringing the best tools/software
    ✅ customers own/control their cloud storage (such as S3 bucket/MinIO etc.)
    ✅ they can work on the same data with multiple compute engines (BI, Streaming, ML use cases)
    These aspects have resonated with orgs suffering from problems like:
    ❌ increasing storage & compute costs
    ❌ inability to manage multiple data copies
    ❌ the need to maintain a 2-tier architecture (data warehouse + data lake), among other pains
    The modularity ("de-bundled database") in a lakehouse is probably one of the most attractive reasons to adopt it. It allows you to be flexible & select the best component for your use case with all the benefits of scalable storage, low cost & data management services such as compaction, clustering, and cleaning. For example:
    - Best of Compute: you can use a compute engine that is performant for your use case (Spark for distributed ETL, Flink for stream processing, maybe DuckDB/Daft for single-node workloads)
    - Open Table Format: choice of open table formats (Apache Hudi, Apache Iceberg, Delta Lake) for transactional capabilities and open storage.
    - Catalog: depending on ecosystem & integrations, you can work with AWS Glue or Unity Catalog, etc.
    Now, while the table formats have provided scope for openness, it is important to recognize that an 'open' data architecture needs more than just open table formats. It requires interoperability across formats (Apache XTable (Incubating)) and catalogs, and open compute services for table management tasks such as clustering, compaction, and cleaning. These are factors that cannot be ignored as we head to the next phase of lakehouses. Read more about it in my blog (link in comments). #dataengineering #softwareengineering

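    As a minimal illustration of this modularity, the sketch below registers an Iceberg catalog over S3-compatible storage with PySpark and writes a table that any Iceberg-aware engine could then read. The catalog name ("demo"), warehouse bucket, and table name are assumptions for illustration, and the iceberg-spark-runtime package is assumed to be on the classpath.
    Python:
        from pyspark.sql import SparkSession

        # Sketch only: Spark session with an Iceberg catalog over S3-compatible storage.
        # "demo", the warehouse path, and the table name are illustrative assumptions.
        spark = (
            SparkSession.builder
            .appName("open-lakehouse-sketch")
            .config("spark.sql.extensions",
                    "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
            .config("spark.sql.catalog.demo", "org.apache.iceberg.spark.SparkCatalog")
            .config("spark.sql.catalog.demo.type", "hadoop")
            .config("spark.sql.catalog.demo.warehouse", "s3a://example-bucket/warehouse")
            .getOrCreate()
        )

        # Write once to an open table format...
        df = spark.createDataFrame([(1, "alice"), (2, "bob")], ["id", "name"])
        df.writeTo("demo.db.users").using("iceberg").createOrReplace()

        # ...and the same storage + metadata stays readable by other Iceberg-aware
        # engines (Flink, Trino, DuckDB), which is the point of de-bundling.
        spark.table("demo.db.users").show()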
  • Insight Mind Sdn Bhd reposted this

    View the profile of Eric Partaker
    Eric Partaker is an Influencer

    The CEO Coach | CEO of the Year '19 | McKinsey, Skype | Author | Follow for posts about business, leadership & self-mastery.

    Great leaders practice kindness as a genuine act - not as a calculated strategy. As Simon Koerner shares in this excellent post (be sure to give him a follow), being authentically kind is a cornerstone of a leader's credibility. Research highlights that acts of kindness by leaders can:
    ✅️ Elevate workplace atmosphere
    ✅️ Increase employee retention
    ✅️ Boost team well-being
    ✅️ Drive team success
    Here's how exceptional leaders show kindness to create meaningful impact:
    ➡️ Active Listening: Genuinely listening to employees' concerns and ideas fosters a sense of value and understanding.
    ➡️ Expressing Gratitude: Acknowledging and appreciating employees' efforts regularly can significantly improve morale and motivation.
    ➡️ Providing Constructive Feedback: Feedback that is positive and actionable helps employees grow while feeling supported.
    ➡️ Showing Empathy: Understanding and sharing others' feelings can strengthen connections and create a positive workplace.
    ➡️ Encouraging Work-Life Balance: Helping employees achieve a healthy balance reduces stress and prevents burnout.
    ➡️ Mentorship and Development: Investing in employees' growth through mentorship demonstrates a leader's commitment to their success.
    ➡️ Acts of Service: Small gestures, like lending a hand or providing needed resources, show care and support.
    ➡️ Creating a Positive Work Environment: Promoting an inclusive and uplifting culture enhances engagement and reduces turnover.
    ➡️ Transparent Communication: Being open and honest builds trust and alleviates uncertainty.
    ➡️ Celebrating Successes: Recognizing both individual and team accomplishments boosts morale and fosters pride in achievements.
    Kindness is a choice. And as a leader, you should choose kindness every day.
    PS: Which of these acts do you think would make the biggest difference in your workplace?
    ♻ Repost to help the leaders in your network. And follow Eric Partaker for more.
    📌 Secure your spot for one of my last two FREE TALKS happening TODAY & SATURDAY! Limited spaces available. To celebrate the 9th cohort of the Peak Performance Program, opening on Wednesday, November 20th, I am holding two final free training sessions. During these sessions, I'll share three key peak performance principles that have helped 1000s of my clients perform at their best both professionally and personally - before opening up for Q&A.
    Learn more about the Peak Performance Program here: https://lnkd.in/ddfBQEJC
    And register for whichever session suits you best below:
    Thursday, November 14th, 5:30pm UK / 12:30pm Eastern: https://lnkd.in/dKkBaY4j
    Saturday, November 16th, 4pm UK time / 11am Eastern: https://lnkd.in/diPHJYpS

  • Insight Mind Sdn Bhd reposted this

    View the profile of Sravya Madipalli

    Senior Manager, Data Science| Ex-Microsoft

    5 Essential Data Cleaning Techniques in Python & SQL
    Data cleaning is an essential, often underappreciated, part of data science. A clean dataset can mean the difference between accurate insights and misleading results. Here are five data cleaning techniques with examples in Python and SQL to make your analysis smoother:
    1. Handle Null Values Thoughtfully: Replace missing values based on your data's nature. For example, you might use the median for missing numeric values.
    Python: df['purchase_amount'].fillna(df['purchase_amount'].median(), inplace=True)
    SQL: UPDATE CustomerData SET purchase_amount = (SELECT MEDIAN(purchase_amount) FROM CustomerData) WHERE purchase_amount IS NULL;
    2. Standardize Dates: Keep date formats consistent to prevent errors in time-based analyses.
    Python: df['purchase_date'] = pd.to_datetime(df['purchase_date'])
    SQL: UPDATE CustomerData SET purchase_date = CAST(purchase_date AS DATE);
    3. Identify and Manage Outliers: Detect outliers using the Interquartile Range (IQR) or similar methods, then decide if capping or transformation is needed.
    Python: Use IQR to cap extreme values.
    SQL:
        WITH Percentiles AS (
            SELECT PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY purchase_amount) AS Q1,
                   PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY purchase_amount) AS Q3
            FROM CustomerData
        ),
        Bounds AS (
            SELECT Q1 - 1.5 * (Q3 - Q1) AS lower_bound,
                   Q3 + 1.5 * (Q3 - Q1) AS upper_bound
            FROM Percentiles
        )
        UPDATE CustomerData
        SET purchase_amount = (SELECT upper_bound FROM Bounds)
        WHERE purchase_amount > (SELECT upper_bound FROM Bounds);
    4. Standardize Categorical Values: Avoid duplicate categories by making values consistent (e.g., converting 'Calif.' and 'California' to 'CA').
    Python: df['state'] = df['state'].replace({'Calif.': 'CA', 'California': 'CA'})
    SQL: UPDATE CustomerData SET state = 'CA' WHERE state IN ('Calif.', 'California');
    5. Trim White Spaces: Remove extra spaces to prevent hidden mismatches in text fields.
    Python: df['name'] = df['name'].str.strip()
    SQL: UPDATE CustomerData SET name = TRIM(name);
    Data cleaning might seem like a small task, but these adjustments can make all the difference in producing reliable insights. What's your go-to data-cleaning technique?
    ♻️ Repost this if you found it useful!
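    As a self-contained illustration, the pandas sketch below applies the same five techniques end to end on a toy DataFrame shaped like the hypothetical CustomerData table; the sample values are invented for the example.
    Python:
        import numpy as np
        import pandas as pd

        # Toy data shaped like the hypothetical CustomerData table from the post.
        df = pd.DataFrame({
            "purchase_amount": [120.0, np.nan, 80.0, 10_000.0],
            "purchase_date": ["2024-01-03", "2024-01-15", "2024-02-10", "2024-03-05"],
            "state": ["Calif.", "California", " CA", "CA"],
            "name": ["  Alice ", "Bob", "Carol  ", "Dan"],
        })

        # 1. Fill missing numeric values with the median.
        df["purchase_amount"] = df["purchase_amount"].fillna(df["purchase_amount"].median())

        # 2. Standardize dates into a single datetime dtype.
        df["purchase_date"] = pd.to_datetime(df["purchase_date"])

        # 3. Cap outliers using the IQR rule.
        q1, q3 = df["purchase_amount"].quantile([0.25, 0.75])
        upper_bound = q3 + 1.5 * (q3 - q1)
        df["purchase_amount"] = df["purchase_amount"].clip(upper=upper_bound)

        # 4. Standardize categorical values (after 5. trimming whitespace).
        df["state"] = df["state"].str.strip().replace({"Calif.": "CA", "California": "CA"})
        df["name"] = df["name"].str.strip()

        print(df)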

  • Insight Mind Sdn Bhd reposted this

    View the profile of Abhisek Sahu

    75K LinkedIn |Senior Azure Data Engineer ↔ Devops Engineer | Azure Databricks | Pyspark | ADF | Synapse| Python | SQL | Power BI

    ETL processes using PySpark:
    👉 ETL, which stands for Extract, Transform, Load, is commonly implemented using PySpark, a powerful framework for large-scale data processing.
    👉 PySpark provides Python APIs for working with big data, particularly within the Apache Spark ecosystem.
    I'll outline a basic ETL process using PySpark:
    📕 Extract: The first step is to extract data from various sources. PySpark supports extracting data from a variety of sources such as files (CSV, JSON, Parquet, etc.), databases (MySQL, PostgreSQL, etc.), and distributed storage systems (HDFS, Amazon S3, etc.).
    📕 Transform: After extracting the data, you perform transformations on it to clean, filter, aggregate, or manipulate it according to your requirements. PySpark provides a rich set of functions for these transformations, which are performed mainly on DataFrames.
    📕 Load: Finally, the transformed data is loaded into a target destination, which could be a database, a data warehouse, or another storage system.
    Sharing here a PySpark cheat sheet document that contains commonly used PySpark functions and their syntax. Credit: Waleed Mousa
    🔈 Join DE Channel: https://lnkd.in/gy4R55Tj
    🤝 Follow 👨💻Abhisek Sahu for a regular curated feed of Data Engineering insights and valuable content! Please like and repost ✅ if you find them useful.
    #dataengineering #apachespark #pyspark #databricks #bigdata #dataengineer
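    A minimal sketch of that Extract -> Transform -> Load flow in PySpark is shown below; the source path, column names, and output location are assumptions for illustration, not taken from the cheat sheet.
    Python:
        from pyspark.sql import SparkSession, functions as F

        spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

        # Extract: read raw data from a source (an assumed CSV location).
        orders = (
            spark.read
            .option("header", "true")
            .option("inferSchema", "true")
            .csv("s3a://example-bucket/raw/orders.csv")
        )

        # Transform: clean, filter, and aggregate with DataFrame functions.
        daily_revenue = (
            orders
            .filter(F.col("status") == "COMPLETED")
            .withColumn("order_date", F.to_date("order_ts"))
            .groupBy("order_date")
            .agg(F.sum("amount").alias("revenue"))
        )

        # Load: write the transformed result to an assumed target location.
        daily_revenue.write.mode("overwrite").parquet("s3a://example-bucket/curated/daily_revenue")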

  • Insight Mind Sdn Bhd reposted this

    View the organization page of python coding

    619 followers

    Python List Summation: Loop vs Built-in Function Comparison
    Explore two methods to sum elements in a Python list: a manual loop vs the `sum()` function. Learn the advantages of each and improve your Python skills!
    #Python #ListSummation #CodeComparison #ProgrammingTips #PythonSnippet #codeaj #codeajay #pythoncoding4u #pythoncoding
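    The comparison boils down to something like the tiny sketch below (illustrative, not the video's exact code):
    Python:
        numbers = [3, 7, 1, 9, 4]

        # Manual loop: explicit, handy when extra per-element logic is needed.
        total = 0
        for n in numbers:
            total += n

        # Built-in sum(): shorter, idiomatic, and implemented in C, so typically faster.
        assert total == sum(numbers) == 24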

  • Insight Mind Sdn Bhd reposted this

    View the profile of Vu Trinh

    I write for 5k+ readers at vutr.substack.com

    🚀🚀 How does Netflix ensure the data quality for thousands of Apache Iceberg tables? Internally, thousands of Apache Iceberg data tables cover all aspects of Netflix's business. For the data audits, Netflix employs the WAP (Write-Audit-Publish) pattern. They first write the data to a hidden Iceberg snapshot and then audit it using an internal data auditor tool. If the audit passes, this snapshot is exposed to the user. In this week’s article, I will explore how Netflix relies on the WAP pattern to audit thousands of Iceberg tables, learn the general idea of WAP, revisit Apache Iceberg specifications and find out how they enable the WAP pattern, and finally, check out the typical Iceberg WAP process. You can find my detailed article here: https://lnkd.in/grDY7NkA ♻️ If you find my work valuable, please repost it so it can reach more people. #dataengineering #dataanalytics #tableformat #lakehouse #datalake #datawarehouse #apacheiceberg

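    A rough sketch of those Write-Audit-Publish steps on an Iceberg table with Spark is shown below. The table name, the audit check, and the snapshot lookup are illustrative assumptions (Netflix's internal auditor is not public), and a Spark session already configured with an Iceberg catalog named "demo" is assumed.
    Python:
        # WRITE: with write.wap.enabled, data written under a WAP id is committed as a
        # snapshot but not yet made the table's current (published) state.
        spark.sql("ALTER TABLE demo.db.events SET TBLPROPERTIES ('write.wap.enabled'='true')")
        spark.conf.set("spark.wap.id", "audit_batch_001")
        new_rows = spark.createDataFrame([(1, "2024-11-14 10:00:00")], ["id", "event_ts"])
        new_rows.writeTo("demo.db.events").append()

        # Look up the staged snapshot from Iceberg's snapshots metadata table
        # (real code would match on the wap.id recorded in the snapshot summary).
        staged_snapshot_id = spark.sql(
            "SELECT snapshot_id FROM demo.db.events.snapshots "
            "ORDER BY committed_at DESC LIMIT 1"
        ).first()[0]

        # AUDIT: run data-quality checks against the staged snapshot only.
        staged = (
            spark.read.format("iceberg")
            .option("snapshot-id", staged_snapshot_id)
            .load("demo.db.events")
        )
        assert staged.filter("event_ts IS NULL").count() == 0

        # PUBLISH: cherry-pick the audited snapshot so downstream readers see it.
        spark.sql(f"CALL demo.system.cherrypick_snapshot('db.events', {staged_snapshot_id})")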
  • Insight Mind Sdn Bhd reposted this

    View the profile of Soumil S.

    Sr. Software Engineer | Big Data & AWS Expert | Apache Hudi Specialist | Spark & AWS Glue| Data Lake Specialist | YouTuber

    Fetch Files Incrementally from S3, MinIO, or Local Directories for Data Processing! In my latest blog, I share a simple Python template that allows you to fetch files incrementally from S3 (s3://), MinIO (s3a://), or local directories (file://). The template makes it easy to track processed files with checkpoints, and can be used to integrate with Spark for further processing. Whether you're working with Hudi, Iceberg, or Delta, this class enables seamless incremental file fetching and processing. This approach streamlines your workflow, making it easier to handle large-scale data processing. 👉 Read the full blog now: #DataEngineering #Spark #FileProcessing #IncrementalProcessing #Hudi #Iceberg #DeltaLake #MinIO #AWS #Python #BigData

    Simple Python Utility Class for Incremental File Retrieval and Processing (CSV, JSON, Parquet, Avro) from Local or Cloud Storage (file://,S3://, S3a:)

    Soumil S. on LinkedIn
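    As a rough stand-in for the idea (the blog link itself is not included above), the sketch below is a minimal checkpoint-based fetcher for local paths; an S3 or MinIO variant would swap in a client such as boto3. The class name, paths, and checkpoint format are assumptions, not the author's code.
    Python:
        import json
        from pathlib import Path

        class IncrementalFileFetcher:
            """Return only files not seen in earlier runs, tracked via a checkpoint file."""

            def __init__(self, source_dir: str, checkpoint_path: str = "checkpoint.json"):
                self.source_dir = Path(source_dir)
                self.checkpoint_path = Path(checkpoint_path)
                self.processed = set()
                if self.checkpoint_path.exists():
                    self.processed = set(json.loads(self.checkpoint_path.read_text()))

            def fetch_new_files(self, pattern: str = "*.csv") -> list:
                # List matching files that have not been processed yet.
                return [
                    str(p) for p in sorted(self.source_dir.glob(pattern))
                    if str(p) not in self.processed
                ]

            def commit(self, files: list) -> None:
                # Record the batch as processed so the next run skips these files.
                self.processed.update(files)
                self.checkpoint_path.write_text(json.dumps(sorted(self.processed)))

        # Usage: fetch the new batch, hand it to a processor, then commit the checkpoint.
        fetcher = IncrementalFileFetcher("/data/landing")
        batch = fetcher.fetch_new_files("*.parquet")
        # df = spark.read.parquet(*batch)  # e.g. feed into Spark/Hudi/Iceberg/Delta
        fetcher.commit(batch)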

  • Insight Mind Sdn Bhd reposted this

    View the organization page of python coding

    619 followers

    Typecasting Across Float and Int in Python
    See how typecasting works in Python when casting across float and int.
    Answer ---> A) 15
    Explanation ---> The string '10.5' is first converted to a float, then to an integer (10), and finally 5 is added, resulting in 15.
    #Python #Typecasting #Coding #codeaj #codeajay #pythoncoding4u #pythoncoding
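    The quoted answer corresponds to the behaviour sketched below; note that calling int('10.5') directly would raise a ValueError, so the string has to pass through float first.
    Python:
        value = '10.5'

        # int(value) would raise: ValueError: invalid literal for int() with base 10
        result = int(float(value)) + 5   # float('10.5') -> 10.5, int(10.5) -> 10, 10 + 5 -> 15
        print(result)                    # 15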

  • Insight Mind Sdn Bhd reposted this

    View the profile of Adam Schroeder

    Growing the most powerful data app community in Python | Plotly's Community Manager

    More about Prompt Engineering Techniques. In this month's Charming Data AI for NYC Meetup we were proud to have Harish Sista present on the difference between zero-shot prompting, multi-shot prompting, chain of thought, and Generated Knowledge prompting. Video in the comments. Thank you Harish for joining us and sharing your knowledge.

  • Insight Mind Sdn Bhd reposted this

    View the profile of Aditya Chandak

    20K LinkedIn | Freelancer | Data Architect | BI Consultant | Azure Data Engineer | AWS | Python/Pyspark | SQL| Snowflake | PowerBI |Tableau

    Data Engineer Interview!
    Imagine you're running a small coffee business and want to improve your operations by understanding your data better. Let's walk through how the Data Engineering Workflow can help you turn raw information into valuable insights.
    1. Data Sources - Think of this as your raw materials: all the data coming into your business. This could be:
    - Sales data from your POS systems.
    - Customer reviews from social media.
    - Inventory updates from suppliers.
    - Weather data to help you predict busy days.
    2. Data Ingestion - Imagine you need to move all that raw data into a central place for processing. Using tools like NiFi and Kafka, data ingestion collects and transports your data (both historical and real-time) into your processing system.
    3. Data Processing - Now that your data is gathered, it's a bit messy. You need to clean and organize it. Tools like Apache Spark help process this data, transforming it from a raw form into something understandable.
    4. Data Storage - Your data is ready and needs a safe place to live. Think of Data Storage as your digital warehouse. You store it in databases like Snowflake or Data Lakes.
    5. Data Integration - Your data doesn't work alone. Just like organizing different ingredients in a recipe, you need Data Integration to combine and manage data workflows. Using a tool like Apache Airflow, this stage ensures that data flows smoothly between all stages, allowing everything to work together.
    6. Data Access - Now, you want to look at this data to make decisions. You access it through tools like Jupyter notebooks, where you can query and analyze the data.
    7. Monitoring & Logging - Just as you monitor your coffee machines, you need to monitor your data systems to ensure they work smoothly. Monitoring tools keep an eye on the entire pipeline and log errors.
    8. Data Governance & Security - Data is valuable, so it's crucial to protect and organize it. Data Governance ensures only authorized people can access sensitive data, and it maintains data quality.
    9. Machine Learning & Advanced Analytics - Here's where things get smart! With Machine Learning, you can use the data to predict future sales, forecast inventory needs, or personalize customer experiences.
    10. Data Visualization & Reporting - Finally, all this processed data needs to be presented in an easy-to-understand way. Using tools like Plotly for visualization, you can create charts, reports, and dashboards.
    Summary: From gathering raw data to analyzing trends and forecasting, this workflow lets you run your business smarter. Imagine waking up each morning to a dashboard with updated insights on what products are hot, inventory needs, and even customer sentiment from social media, all powered by data engineering! This seamless flow helps you make data-driven decisions without getting bogged down by manual analysis, enabling you to focus on growing your business.

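    To make the workflow concrete, the sketch below chains ingestion, processing, and reporting tasks in an Apache Airflow DAG for the hypothetical coffee-shop pipeline. The DAG id, task names, and placeholder task bodies are assumptions for illustration, and Airflow 2.4+ (with the "schedule" argument) is assumed.
    Python:
        from datetime import datetime

        from airflow import DAG
        from airflow.operators.python import PythonOperator

        def ingest_sales():        # e.g. pull POS exports into the landing zone
            ...

        def process_sales():       # e.g. clean and aggregate with Spark or pandas
            ...

        def refresh_dashboard():   # e.g. push curated tables to the BI/reporting layer
            ...

        with DAG(
            dag_id="coffee_shop_daily_pipeline",
            start_date=datetime(2024, 1, 1),
            schedule="@daily",
            catchup=False,
        ) as dag:
            ingest = PythonOperator(task_id="ingest_sales", python_callable=ingest_sales)
            process = PythonOperator(task_id="process_sales", python_callable=process_sales)
            report = PythonOperator(task_id="refresh_dashboard", python_callable=refresh_dashboard)

            # Ingestion feeds processing, which feeds the dashboard refresh.
            ingest >> process >> report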
