ML
    • Recent
    • Categories
    • Tags
    • Popular
    • Users
    • Groups
    • Register
    • Login

    Extract Data from .csv file with Python

    IT Discussion
    6
    19
    4.6k
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • LakshmanaL
      Lakshmana
      last edited by Lakshmana

      I am new to Python but need to autofilter the data from the excel sheet according to the Engineer name and Age of the tickets in the excel.I need to filter the data above 15 Days and copy to the another sheet of the excel.Is this possible through Python

      travisdh1T scottalanmillerS RomoR 3 Replies Last reply Reply Quote 0
      • travisdh1T
        travisdh1 @Lakshmana
        last edited by

        @lakshmana said in Python with Excel Auto Filter and Extract Data:

        I am new to Python but need to autofilter the data from the excel sheet according to the Engineer name and Age of the tickets in the excel.I need to filter the data above 15 Days and copy to the another sheet of the excel.Is this possible through Python

        Why all these oddball requests? An excel macro can do this easily and takes seconds to create.

        1 Reply Last reply Reply Quote 0
        • LakshmanaL
          Lakshmana
          last edited by

          Need to learn automation from python so trying that .Is that possible to automate through macro ?

          travisdh1T 1 Reply Last reply Reply Quote 0
          • F
            flaxking
            last edited by

            If you're working within Excel, then Python is not what you're looking for.
            If you're manipulating files, preferably csv files, then Python is your friend.

            I bet you there's a nice module that makes it easy to work with csv files, and then saving into a csv file is pretty easy.

            1 Reply Last reply Reply Quote 1
            • LakshmanaL
              Lakshmana
              last edited by

              Whether macros can be used in the .csv files also ?

              ObsolesceO 1 Reply Last reply Reply Quote 0
              • F
                flaxking
                last edited by

                I once had to work with a database that the only connection we had to it was through Telnet.

                The queries could return xml output, however it was a nested database, any queries utilizing nested relations wouldn't preserve the relationship in xml.

                So I had to capture the xml, then work with in python and create the associates, filter it how i wanted it, and then export to csv.

                The csv part was the easiest.

                1 Reply Last reply Reply Quote 0
                • travisdh1T
                  travisdh1 @Lakshmana
                  last edited by

                  @lakshmana said in Python with Excel Auto Filter and Extract Data:

                  Need to learn automation from python so trying that .Is that possible to automate through macro ?

                  Is it possible, probably. Is it the correct use of using python to automate things, then no. Try http://www.pythonforbeginners.com/code-snippets-source-code/python-code-examples

                  1 Reply Last reply Reply Quote 0
                  • ObsolesceO
                    Obsolesce @Lakshmana
                    last edited by

                    @lakshmana said in Python with Excel Auto Filter and Extract Data:

                    Whether macros can be used in the .csv files also ?

                    CSV files are just comma-separated text files, and hold no formatting or macro capabilities.

                    I haven't looked at Python about this, but I found this for PHP: https://github.com/eaglewu/phpexcel

                    1 Reply Last reply Reply Quote 1
                    • scottalanmillerS
                      scottalanmiller @Lakshmana
                      last edited by

                      @lakshmana said in Python with Excel Auto Filter and Extract Data:

                      I am new to Python but need to autofilter the data from the excel sheet according to the Engineer name and Age of the tickets in the excel.

                      Two things here then are a problem.

                      1. It is not an Excel file if it is CSV. So this changes the question completely. It's about CSV files.
                      2. You don't do this in Excel, you just use Python.

                      So the question should be "how to extract data from text file with Python". Extremely different than the question asked.

                      LakshmanaL 1 Reply Last reply Reply Quote 3
                      • LakshmanaL
                        Lakshmana @scottalanmiller
                        last edited by

                        @scottalanmiller said in Python with Excel Auto Filter and Extract Data:

                        @lakshmana said in Python with Excel Auto Filter and Extract Data:

                        I am new to Python but need to autofilter the data from the excel sheet according to the Engineer name and Age of the tickets in the excel.

                        Two things here then are a problem.

                        1. It is not an Excel file if it is CSV. So this changes the question completely. It's about CSV files.
                        2. You don't do this in Excel, you just use Python.

                        So the question should be "how to extract data from text file with Python". Extremely different than the question asked.

                        I will try with .csv file and then convert that file into excel by powershell

                        1 Reply Last reply Reply Quote 1
                        • RomoR
                          Romo @Lakshmana
                          last edited by Romo

                          @lakshmana said in Extract Data from .csv file with Python:

                          I am new to Python but need to autofilter the data from the excel sheet according to the Engineer name and Age of the tickets in the excel.I need to filter the data above 15 Days and copy to the another sheet of the excel.Is this possible through Python

                          It is indeed possible to do with python. You have great libraries built just for data analysis and manipulation for cases like this.

                          • pandas
                          • xlwings
                          • openpyxl
                          • XlsxWriter
                          RomoR LakshmanaL 2 Replies Last reply Reply Quote 0
                          • RomoR
                            Romo @Romo
                            last edited by Romo

                            Now that I have some free time, I'll give you a simple example of some of the things you can do using pandas.

                            Our dataset will be all the posts in this topic, scraped and saved into an excel file.

                            DATASET : pandas-test.xslx
                            alt text

                            I am going to be using a Jupyter notebook just to make the output clearer.

                            $ import pandas
                            $ pandas.read_excel('pandas-test.xlsx')
                            

                            alt text
                            That is the whole spreadsheet read and basically printed out, but we can't work with that. We need to read the file into a variable and start working with it.

                            # Reading the file to variable df
                            $ df = pandas.read_excel('pandas-test.xlsx')
                            
                            # Printing how many rows and column in the file (rows,columns)
                            $ df.shape
                            (11,4)
                            
                            # Printing the column names
                            $ df.columns
                            Index(['Date', 'Time ', 'User', 'Post'], dtype='object')
                            

                            Only extracting columns - Date and User

                            $ df[['Date', 'User']]
                            

                            alt text

                            Lets check how many posts per day

                            $ df.groupby(['Date'])['User'].count()
                            

                            alt text
                            Now lets check by day and time

                            $ df.groupby(['Date', 'Time ']).count()
                            

                            alt text


                            Lets filter only your posts and create a new csv file based on the data found.

                            $ subset = df.loc[df['User']=='Lakshmana']
                            

                            alt text

                            Create a csv file only containing your posts

                            $ subset.to_csv('Lakshmana-posts.csv')
                            

                            Final results your new csv file with your posts filtered out.

                            Lakshmana-posts.csv
                            alt text


                            EDIT: Just if you need it, an example of filtering by date and user
                            alt text

                            LakshmanaL 1 Reply Last reply Reply Quote 5
                            • LakshmanaL
                              Lakshmana @Romo
                              last edited by

                              @romo said in Extract Data from .csv file with Python:

                              Now that I have some free time, I'll give you a simple example of some of the things you can do using pandas.

                              Our dataset will be all the posts in this topic, scraped and saved into an excel file.

                              DATASET : pandas-test.xslx
                              alt text

                              I am going to be using a Jupyter notebook just to make the output clearer.

                              $ import pandas
                              $ pandas.read_excel('pandas-test.xlsx')
                              

                              alt text
                              That is the whole spreadsheet read and basically printed out, but we can't work with that. We need to read the file into a variable and start working with it.

                              # Reading the file to variable df
                              $ df = pandas.read_excel('pandas-test.xlsx')
                              
                              # Printing how many rows and column in the file (rows,columns)
                              $ df.shape
                              (11,4)
                              
                              # Printing the column names
                              $ df.columns
                              Index(['Date', 'Time ', 'User', 'Post'], dtype='object')
                              

                              Only extracting columns - Date and User

                              $ df[['Date', 'User']]
                              

                              alt text

                              Lets check how many posts per day

                              $ df.groupby(['Date'])['User'].count()
                              

                              alt text
                              Now lets check by day and time

                              $ df.groupby(['Date', 'Time ']).count()
                              

                              alt text


                              Lets filter only your posts and create a new csv file based on the data found.

                              $ subset = df.loc[df['User']=='Lakshmana']
                              

                              alt text

                              Create a csv file only containing your posts

                              $ subset.to_csv('Lakshmana-posts.csv')
                              

                              Final results your new csv file with your posts filtered out.

                              Lakshmana-posts.csv
                              alt text


                              EDIT: Just if you need it, an example of filtering by date and user
                              alt text

                              Good example thanks to making me understand

                              1 Reply Last reply Reply Quote 0
                              • LakshmanaL
                                Lakshmana
                                last edited by

                                @romo said in Extract Data from .csv file with Python:

                                Printing how many rows and column in the file (rows,columns)

                                $ df.shape
                                (11,4)

                                If i need select the sheet 2 in the Excel means what needs to be done ?

                                1 Reply Last reply Reply Quote 0
                                • RomoR
                                  Romo
                                  last edited by Romo

                                  You have to read the file differently, and then read the sheet you want passing the file object and then the sheet name.


                                  DATASET: pandas-test-xlsx, Sheet - Test-Sheet
                                  alt text

                                  # Read the whole file
                                  $ file = pandas.ExcelFile('pandas-test.xlsx')
                                  
                                  # Specify the file and then the sheet you want to work with
                                  $ mysheet = pandas.read_excel(file, 'Test-Sheet')
                                  
                                  # Work with the dataframe now contained in the mysheet variable
                                  $ mysheet
                                  

                                  alt text


                                  EDIT: Just doubled checked an you can pass the file name as a string to the read_excel function so that would be alot easier.

                                  pandas.read_excel('pandas-test.xlsx', 'Test-Sheet')
                                  
                                  1 Reply Last reply Reply Quote 0
                                  • LakshmanaL
                                    Lakshmana @Romo
                                    last edited by

                                    @romo said in Extract Data from .csv file with Python:

                                    @lakshmana said in Extract Data from .csv file with Python:

                                    I am new to Python but need to autofilter the data from the excel sheet according to the Engineer name and Age of the tickets in the excel.I need to filter the data above 15 Days and copy to the another sheet of the excel.Is this possible through Python

                                    It is indeed possible to do with python. You have great libraries built just for data analysis and manipulation for cases like this.

                                    • pandas
                                    • xlwings
                                    • openpyxl
                                    • XlsxWriter

                                    All the three package installed in Windows only Pandas not installing I am using Python 3.7

                                    1 Reply Last reply Reply Quote 0
                                    • RomoR
                                      Romo
                                      last edited by

                                      I usually install Anaconda in windows because I use other things included in it and it is the easiest way of having everything setup for you, but it might really be overkill for you to install it because it is pretty big.

                                      You can also install miniconda and download the required packages from it.

                                      The official documentation marks the above two methods as the easiest for beginners but you can also install it from Pypi. pip install pandas

                                      LakshmanaL 1 Reply Last reply Reply Quote 0
                                      • LakshmanaL
                                        Lakshmana @Romo
                                        last edited by

                                        @romo said in Extract Data from .csv file with Python:

                                        pip install pandas

                                        raise DistutilsError("Setup script exited with %s" % (v.args[0],))
                                        distutils.errors.DistutilsError: Setup script exited with error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": http://landinghub.visualstudio.com/visual-cpp-build-tools

                                        RomoR 1 Reply Last reply Reply Quote 0
                                        • RomoR
                                          Romo @Lakshmana
                                          last edited by

                                          @lakshmana said in Extract Data from .csv file with Python:

                                          @romo said in Extract Data from .csv file with Python:

                                          pip install pandas

                                          raise DistutilsError("Setup script exited with %s" % (v.args[0],))
                                          distutils.errors.DistutilsError: Setup script exited with error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": http://landinghub.visualstudio.com/visual-cpp-build-tools

                                          Some of the dependencies when installing it with pip are required to be compiled that is why for beginners it is just better to use either Anaconda or miniconda, especially for a Windows install.

                                          1 Reply Last reply Reply Quote 0
                                          • 1 / 1
                                          • First post
                                            Last post