
    Is this server strategy reckless and/or insane?

    IT Discussion
    224 Posts 12 Posters 37.8k Views
    • creaytC
      creayt @scottalanmiller
      last edited by

      @scottalanmiller said in Is this server strategy reckless and/or insane?:

      @creayt said in Is this server strategy reckless and/or insane?:

      @scottalanmiller said in Is this server strategy reckless and/or insane?:

      @creayt said in Is this server strategy reckless and/or insane?:

      @scottalanmiller said in Is this server strategy reckless and/or insane?:

      @creayt said in Is this server strategy reckless and/or insane?:

      @travisdh1 said in Is this server strategy reckless and/or insane?:

      Has anyone mentioned going OBR5 instead of split arrays yet?

      Also, I'd spend the little extra money for the Pro edition of the Samsung 850 drives if you want to use commodity parts rather than Dell supplied ones.

      People did suggest OBR5, yep. The benchmarks I ran ( see the large Crystal DiskMark grid below ) made me feel like I'm going to be giving up a lot of performance for not that much additional peace of mind w/ a 5 given my set up and the ability of either server to temporarily take over duties in a pinch. My overarching goal is for most requests to be as close to perceptibly instant as possible most of the time, w/ some downtime being acceptable.

      The drives are all Pros, good tip, thanks.

      The big question is... is it performance that affects the application? Benchmarks and raw numbers don't matter all that much. What matters is how the app is impacted. That's why people are asking about the WAN and other components. Getting that kind of performance on such a small web app typically is all throw away performance. Not necessarily, but often.

      It's heavily realtime-oriented, by which I mean I'm going to be attempting to stream the presence and actions of users to other users in real time and let them see what the other is doing Google Docs style. The ability to retrieve a good handful of information from MySQL per request in as close to 0 ms as possible is very important for the effect to work correctly, hence wanting to keep the app server and database on the same machine for example. Every little ms counts.

      This is where it feels to me like MySQL was a bad choice. I don't know your details, but MySQL seems at odds with all of your other requirements.

      Can you elaborate a little bit?

      MySQL is a light-use database for systems with very light relational needs. Real-time systems are almost never relational, so this is generally (but I don't know your use case) the wrong architecture. And if you need large, high-performance relational, then MySQL isn't the right choice; go with PostgreSQL in the open source world or SQL Server in the closed source one.

      Isn't MySQL what Facebook uses for all of its realtime stuff?

      • scottalanmillerS
        scottalanmiller @creayt
        last edited by

        @creayt said in Is this server strategy reckless and/or insane?:

        @storageninja said in Is this server strategy reckless and/or insane?:

        @creayt said in Is this server strategy reckless and/or insane?:

        @storageninja said in Is this server strategy reckless and/or insane?:

        @creayt said in Is this server strategy reckless and/or insane?:

        Ideally more than that, but it'll be a gradual climb. Right now it's in private alpha w/ ~ 100 users and they post stuff all the time. Once I make it public I imagine the content volume will skyrocket.

        Why not use Cloud/PaaS? There are some systems where you pay by the transaction so you're not out capital for hardware that will not scale where you need to go a long time, and you will not waste money on hardware if this project goes nowhere.

        Pricing out equivalent horsepower on Amazon I think came to something like $50k a month; this whole setup cost me under $10k I believe. By the time I exhaust the capabilities of this hardware/investment, I hope, I'll be at the venture capital phase and can redeploy into a fully cloud strategy, grinning shit-eatingly at how well that original $10k investment served me.

        Will also mention that colocation where I live is a dirt-effing-cheap $55-per-U/month.

        There are far cheaper IaaS providers than Amazon (I assume you are looking at EC2, when you should be looking at RDS if you're doing AWS). I'm partial to Softlayer these days, but to each their own.

        Deploying and managing your own infrastructure for a startup is a nightmare, as if/when your product "blows up" and goes from 100 to 100K users, it will implode and crash on the weekend before you can get new hardware in and scale it, or refactor for a platform with real scalability. If you're worried about cloud lock-in, use abstraction systems that allow for multi-cloud strategies (although honestly in the early phase I'd just accept the lock-in, as that's easier to refactor than trying to refactor the platform AND scale at the same time).

        If you can't maintain growth and have large hiccups in engineering, VC gets spooked easily.

        Also, if you're really looking to scale, one thing is trying to limit your dependency on RDBMS in general. 9/10 times I see a startup using one, they should have used object storage or a NoSQL system.

        Then again, I'm just a Palo Alto Serf working for "the man" and not feeling the wind in my hair of founding the next big thing in the garage.

        Thanks Dad 🙂

        As someone architecting a high capacity real time system myself, avoiding relational and going with NoSQL systems designed for massive scale was key to our core design.

        • creaytC
          creayt @scottalanmiller
          last edited by

          @scottalanmiller said in Is this server strategy reckless and/or insane?:

          @creayt said in Is this server strategy reckless and/or insane?:

          @scottalanmiller said in Is this server strategy reckless and/or insane?:

          @creayt said in Is this server strategy reckless and/or insane?:

          @scottalanmiller said in Is this server strategy reckless and/or insane?:

          @creayt said in Is this server strategy reckless and/or insane?:

          @travisdh1 said in Is this server strategy reckless and/or insane?:

          Has anyone mentioned going OBR5 instead of split arrays yet?

          Also, I'd spend the little extra money for the Pro edition of the Samsung 850 drives if you want to use commodity parts rather than Dell supplied ones.

          People did suggest OBR5, yep. The benchmarks I ran ( see the large Crystal DiskMark grid below ) made me feel like I'm going to be giving up a lot of performance for not that much additional peace of mind w/ a 5 given my set up and the ability of either server to temporarily take over duties in a pinch. My overarching goal is for most requests to be as close to perceptibly instant as possible most of the time, w/ some downtime being acceptable.

          The drives are all Pros, good tip, thanks.

          The big question is... is it performance that affects the application? Benchmarks and raw numbers don't matter all that much. What matters is how the app is impacted. That's why people are asking about the WAN and other components. Getting that kind of performance on such a small web app typically is all throw away performance. Not necessarily, but often.

          It's heavily realtime-oriented, by which I mean I'm going to be attempting to stream the presence and actions of users to other users in real time and let them see what the other is doing Google Docs style. The ability to retrieve a good handful of information from MySQL per request in as close to 0 ms as possible is very important for the effect to work correctly, hence wanting to keep the app server and database on the same machine for example. Every little ms counts.

          This is where it feels to me like MySQL was a bad choice. I don't know your details, but MySQL seems at odds with all of your other requirements.

          Can you elaborate a little bit?

          MySQL is a light-use database for systems with very light relational needs. Real-time systems are almost never relational, so this is generally (but I don't know your use case) the wrong architecture. And if you need large, high-performance relational, then MySQL isn't the right choice; go with PostgreSQL in the open source world or SQL Server in the closed source one.

          I'd be more than happy to explore SQL Server if you think its performance outclasses MySQL w/ the same schemas presuming perfect indexing on both products. Wouldn't mind dropping a little more for a license. I'll look into it. Do you know of any good comparisons/benchmarks to start w/? Thanks.

          • scottalanmillerS
            scottalanmiller @creayt
            last edited by

            @creayt said in Is this server strategy reckless and/or insane?:

            @storageninja said in Is this server strategy reckless and/or insane?:

            @scottalanmiller said in Is this server strategy reckless and/or insane?:

            @creayt said in Is this server strategy reckless and/or insane?:

            @scottalanmiller said in Is this server strategy reckless and/or insane?:

            @creayt said in Is this server strategy reckless and/or insane?:

            @travisdh1 said in Is this server strategy reckless and/or insane?:

            Has anyone mentioned going OBR5 instead of split arrays yet?

            Also, I'd spend the little extra money for the Pro edition of the Samsung 850 drives if you want to use commodity parts rather than Dell supplied ones.

            People did suggest OBR5, yep. The benchmarks I ran ( see the large Crystal DiskMark grid below ) made me feel like I'm going to be giving up a lot of performance for not that much additional peace of mind w/ a 5 given my set up and the ability of either server to temporarily take over duties in a pinch. My overarching goal is for most requests to be as close to perceptibly instant as possible most of the time, w/ some downtime being acceptable.

            The drives are all Pros, good tip, thanks.

            The big question is... is it performance that affects the application? Benchmarks and raw numbers don't matter all that much. What matters is how the app is impacted. That's why people are asking about the WAN and other components. Getting that kind of performance on such a small web app typically is all throw away performance. Not necessarily, but often.

            It's heavily realtime-oriented, by which I mean I'm going to be attempting to stream the presence and actions of users to other users in real time and let them see what the other is doing Google Docs style. The ability to retrieve a good handful of information from MySQL per request in as close to 0 ms as possible is very important for the effect to work correctly, hence wanting to keep the app server and database on the same machine for example. Every little ms counts.

            This is where it feels to me like MySQL was a bad choice. I don't know your details, but MySQL seems at odds with all of your other requirements.

            Yah, consistency isn't needed, and MySQL will scale like shit no matter how much you shard it.

            Really? What scale does that come into play on? Just ran a sample query against a table on my local system running MySQL w/ no tuning ( out of the box developer install -- low ram ), table has 77 million plus rows and a few simple indexes, and returns what I want in .001 seconds. I've always found MySQL to be super duper duper fast when your schema is well-designed and your queries are strategically indexed.

            How many nodes are you handling in how many geographic locations with how much failover? MySQL isn't designed for use in a large scale system at all, I'm not even sure how to do it. We used it at Change.org and it was a beast to keep working when other things like Cassandra blew its doors off and took far less effort.

            Rows per table isn't a good indication of what will hit you when you grow. When you are dealing with hundreds of thousands of users all hitting at once, load balancing, resiliency, sharding and other factors are what will determine your ability to serve requests.

            • scottalanmillerS
              scottalanmiller @creayt
              last edited by

              @creayt said in Is this server strategy reckless and/or insane?:

              @scottalanmiller said in Is this server strategy reckless and/or insane?:

              @creayt said in Is this server strategy reckless and/or insane?:

              @scottalanmiller said in Is this server strategy reckless and/or insane?:

              @creayt said in Is this server strategy reckless and/or insane?:

              @scottalanmiller said in Is this server strategy reckless and/or insane?:

              @creayt said in Is this server strategy reckless and/or insane?:

              @travisdh1 said in Is this server strategy reckless and/or insane?:

              Has anyone mentioned going OBR5 instead of split arrays yet?

              Also, I'd spend the little extra money for the Pro edition of the Samsung 850 drives if you want to use commodity parts rather than Dell supplied ones.

              People did suggest OBR5, yep. The benchmarks I ran ( see the large Crystal DiskMark grid below ) made me feel like I'm going to be giving up a lot of performance for not that much additional peace of mind w/ a 5 given my set up and the ability of either server to temporarily take over duties in a pinch. My overarching goal is for most requests to be as close to perceptibly instant as possible most of the time, w/ some downtime being acceptable.

              The drives are all Pros, good tip, thanks.

              The big question is... is it performance that affects the application? Benchmarks and raw numbers don't matter all that much. What matters is how the app is impacted. That's why people are asking about the WAN and other components. Getting that kind of performance on such a small web app typically is all throw away performance. Not necessarily, but often.

              It's heavily realtime-oriented, by which I mean I'm going to be attempting to stream the presence and actions of users to other users in real time and let them see what the other is doing Google Docs style. The ability to retrieve a good handful of information from MySQL per request in as close to 0 ms as possible is very important for the effect to work correctly, hence wanting to keep the app server and database on the same machine for example. Every little ms counts.

              This is where it feels to me like MySQL was a bad choice. I don't know your details, but MySQL seems at odds with all of your other requirements.

              Can you elaborate a little bit?

              MySQL is a light-use database for systems with very light relational needs. Real-time systems are almost never relational, so this is generally (but I don't know your use case) the wrong architecture. And if you need large, high-performance relational, then MySQL isn't the right choice; go with PostgreSQL in the open source world or SQL Server in the closed source one.

              I'd be more than happy to explore SQL Server if you think its performance outclasses MySQL w/ the same schemas presuming perfect indexing on both products. Wouldn't mind dropping a little more for a license. I'll look into it. Do you know of any good comparisons/benchmarks to start w/? Thanks.

              I would never, ever use SQL Server for something like this. I was only pointing out that in the closed source world it is faster. PostgreSQL is a WAY better choice. But none of them seem like good choices for your project as they are all relational and relational will be a major problem.

              • creaytC
                creayt @scottalanmiller
                last edited by

                @scottalanmiller said in Is this server strategy reckless and/or insane?:

                @creayt said in Is this server strategy reckless and/or insane?:

                @storageninja said in Is this server strategy reckless and/or insane?:

                @scottalanmiller said in Is this server strategy reckless and/or insane?:

                @creayt said in Is this server strategy reckless and/or insane?:

                @scottalanmiller said in Is this server strategy reckless and/or insane?:

                @creayt said in Is this server strategy reckless and/or insane?:

                @travisdh1 said in Is this server strategy reckless and/or insane?:

                Has anyone mentioned going OBR5 instead of split arrays yet?

                Also, I'd spend the little extra money for the Pro edition of the Samsung 850 drives if you want to use commodity parts rather than Dell supplied ones.

                People did suggest OBR5, yep. The benchmarks I ran ( see the large Crystal DiskMark grid below ) made me feel like I'm going to be giving up a lot of performance for not that much additional peace of mind w/ a 5 given my set up and the ability of either server to temporarily take over duties in a pinch. My overarching goal is for most requests to be as close to perceptibly instant as possible most of the time, w/ some downtime being acceptable.

                The drives are all Pros, good tip, thanks.

                The big question is... is it performance that affects the application? Benchmarks and raw numbers don't matter all that much. What matters is how the app is impacted. That's why people are asking about the WAN and other components. Getting that kind of performance on such a small web app typically is all throw away performance. Not necessarily, but often.

                It's heavily realtime-oriented, by which I mean I'm going to be attempting to stream the presence and actions of users to other users in real time and let them see what the other is doing Google Docs style. The ability to retrieve a good handful of information from MySQL per request in as close to 0 ms as possible is very important for the effect to work correctly, hence wanting to keep the app server and database on the same machine for example. Every little ms counts.

                This is where it feels to me like MySQL was a bad choice. I don't know your details, but MySQL seems at odds with all of your other requirements.

                Yah, consistency isn't needed, and MySQL will scale like shit no matter how much you shard it.

                Really? What scale does that come into play on? Just ran a sample query against a table on my local system running MySQL w/ no tuning ( out of the box developer install -- low ram ), table has 77 million plus rows and a few simple indexes, and returns what I want in .001 seconds. I've always found MySQL to be super duper duper fast when your schema is well-designed and your queries are strategically indexed.

                How many nodes are you handling in how many geographic locations with how much failover? MySQL isn't designed for use in a large scale system at all, I'm not even sure how to do it. We used it at Change.org and it was a beast to keep working when other things like Cassandra blew its doors off and took far less effort.

                Rows per table isn't a good indication of what will hit you when you grow. When you are dealing with hundreds of thousands of users all hitting at once, load balancing, resiliency, sharding and other factors are what will determine your ability to serve requests.

                https://www.mysql.com/products/cluster/

                • creaytC
                  creayt @scottalanmiller
                  last edited by

                  @scottalanmiller said in Is this server strategy reckless and/or insane?:

                  I would never, ever use SQL Server for something like this. I was only pointing out that in the closed source world it is faster. PostgreSQL is a WAY better choice. But none of them seem like good choices for your project as they are all relational and relational will be a major problem.

                  But all of my data is relational, the nature of the project is relational, I don't know how I could even do it w/ NoSQL unless I just duplicated IDs/data everywhere.
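
To make the "duplicated IDs/data everywhere" worry concrete, here is a minimal sketch of what that duplication tends to look like in a document-style store. Everything in it is hypothetical (the table shapes, field names, and the `build_conversation_document` helper are invented for illustration, not taken from the actual app), and it uses plain Python dicts rather than any real database client.

```python
# Hypothetical data: names and fields are illustrative, not from the actual app.

# Relational shape: each fact is stored once and related by ID.
users = {1: {"name": "alice"}, 2: {"name": "bob"}}
tags = {10: {"label": "raid"}, 11: {"label": "mysql"}}
conversations = {100: {"title": "OBR5 or split arrays?"}}
conversation_tags = [(100, 10), (100, 11)]
messages = [
    {"conversation_id": 100, "user_id": 1, "body": "Has anyone tried OBR5?"},
    {"conversation_id": 100, "user_id": 2, "body": "Benchmarks say it is slower."},
]


def build_conversation_document(conversation_id):
    """Join the normalized rows into one self-contained document.

    The user name and tag labels are copied into the document, so a
    single read returns everything needed to render the conversation.
    """
    return {
        "_id": conversation_id,
        "title": conversations[conversation_id]["title"],
        "tags": [
            tags[tag_id]["label"]
            for conv_id, tag_id in conversation_tags
            if conv_id == conversation_id
        ],
        "messages": [
            {
                "user_id": m["user_id"],
                "user_name": users[m["user_id"]]["name"],  # duplicated on purpose
                "body": m["body"],
            }
            for m in messages
            if m["conversation_id"] == conversation_id
        ],
    }


doc = build_conversation_document(100)
print(doc)
```

The trade-off is visible in `user_name`: reads need no joins, but renaming a user means touching every document that copied the name.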

                  • scottalanmillerS
                    scottalanmiller @creayt
                    last edited by

                    @creayt said in Is this server strategy reckless and/or insane?:

                    I'd be more than happy to explore SQL Server if you think its performance outclasses MySQL w/ the same schemas presuming perfect indexing on both products. Wouldn't mind dropping a little more for a license.

                    That could easily double your costs, your total costs. You'll need Windows licenses and SQL Server licenses. It would be crazy, and the cost would grow as you grew.

                    • scottalanmillerS
                      scottalanmiller @creayt
                      last edited by

                      @creayt said in Is this server strategy reckless and/or insane?:

                      @scottalanmiller said in Is this server strategy reckless and/or insane?:

                      I would never, ever use SQL Server for something like this. I was only pointing out that in the closed source world it is faster. PostgreSQL is a WAY better choice. But none of them seem like good choices for your project as they are all relational and relational will be a major problem.

                      But all of my data is relational, the nature of the project is relational, I don't know how I could even do it w/ NoSQL unless I just duplicated IDs/data everywhere.

                      What about it makes it relational? Is it financial data?

                      • creaytC
                        creayt @scottalanmiller
                        last edited by creayt

                        @scottalanmiller said in Is this server strategy reckless and/or insane?:

                        @creayt said in Is this server strategy reckless and/or insane?:

                        @scottalanmiller said in Is this server strategy reckless and/or insane?:

                        I would never, ever use SQL Server for something like this. I was only pointing out that in the closed source world it is faster. PostgreSQL is a WAY better choice. But none of them seem like good choices for your project as they are all relational and relational will be a major problem.

                        But all of my data is relational, the nature of the project is relational, I don't know how I could even do it w/ NoSQL unless I just duplicated IDs/data everywhere.

                        What about it makes it relational? Is it financial data?

                        It's people interacting w/ public web content as intermingle-able groups, having cross-pollinating conversations about it, relating each conversation, participant, tag, and content piece to each other, classifying it in personal and group contexts for future relation, and using various analytical algorithms, eventually AI, to analyze the relationships between the different data at each tier in the hierarchy and use it as a suggestion engine to expose users to new groups, conversations, content, and other users they'll like, in a nutshell.

                        • scottalanmillerS
                          scottalanmiller @creayt
                          last edited by

                          @creayt said in Is this server strategy reckless and/or insane?:

                          @scottalanmiller said in Is this server strategy reckless and/or insane?:

                          @creayt said in Is this server strategy reckless and/or insane?:

                          @scottalanmiller said in Is this server strategy reckless and/or insane?:

                          I would never, ever use SQL Server for something like this. I was only pointing out that in the closed source world it is faster. PostgreSQL is a WAY better choice. But none of them seem like good choices for your project as they are all relational and relational will be a major problem.

                          But all of my data is relational, the nature of the project is relational, I don't know how I could even do it w/ NoSQL unless I just duplicated IDs/data everywhere.

                          What about it makes it relational? Is it financial data?

                          It's people interacting w/ public web content as intermingle-able groups, having cross-pollinating conversations about it, relating each conversation, participant, tag, and content piece to each other, classifying it in personal and group contexts for future relation, and using various analytical algorithms, eventually AI, to analyze the relationships between the different data at each tier in the hierarchy and use it as a suggestion engine to expose users to new groups, conversations, content, and other users, in a nutshell.

                          That's like textbook NoSQL target content there. Conversations, groups, tagging, analytics.... it's like the "who's who" of NoSQL target topics.

                          • scottalanmillerS
                            scottalanmiller
                            last edited by

                            You are describing tasks often handled by engines like Hadoop, ElasticSearch, Cassandra, MongoDB, etc.

                            • creaytC
                              creayt @scottalanmiller
                              last edited by

                              @scottalanmiller said in Is this server strategy reckless and/or insane?:

                              @creayt said in Is this server strategy reckless and/or insane?:

                              @scottalanmiller said in Is this server strategy reckless and/or insane?:

                              @creayt said in Is this server strategy reckless and/or insane?:

                              @scottalanmiller said in Is this server strategy reckless and/or insane?:

                              I would never, ever use SQL Server for something like this. I was only pointing out that in the closed source world it is faster. PostgreSQL is a WAY better choice. But none of them seem like good choices for your project as they are all relational and relational will be a major problem.

                              But all of my data is relational, the nature of the project is relational, I don't know how I could even do it w/ NoSQL unless I just duplicated IDs/data everywhere.

                              What about it makes it relational? Is it financial data?

                              It's people interacting w/ public web content as intermingle-able groups, having cross-pollinating conversations about it, relating each conversation, participant, tag, and content piece to each other, classifying it in personal and group contexts for future relation, and using various analytical algorithms, eventually AI, to analyze the relationships between the different data at each tier in the hierarchy and use it as a suggestion engine to expose users to new groups, conversations, content, and other users, in a nutshell.

                              That's like textbook NoSQL target content there. Conversations, groups, tagging, analytics.... it's like the "who's who" of NoSQL target topics.

                              Interesting, I'd never heard that before and RDBMS has been so great for any use case I've hit so far that I'd kind of written off NoSQL as being extraneous in any project I've needed a db for. Will look into it, thank you.

                              As far as hardware, how would what I've described so far work for going w/ NoSQL instead of MySQL? Anything you'd change specifically?

                              • scottalanmillerS
                                scottalanmiller @creayt
                                last edited by

                                @creayt said in Is this server strategy reckless and/or insane?:

                                @scottalanmiller said in Is this server strategy reckless and/or insane?:

                                @creayt said in Is this server strategy reckless and/or insane?:

                                @scottalanmiller said in Is this server strategy reckless and/or insane?:

                                @creayt said in Is this server strategy reckless and/or insane?:

                                @scottalanmiller said in Is this server strategy reckless and/or insane?:

                                I would never, ever use SQL Server for something like this. I was only pointing out that in the closed source world it is faster. PostgreSQL is a WAY better choice. But none of them seem like good choices for your project as they are all relational and relational will be a major problem.

                                But all of my data is relational, the nature of the project is relational, I don't know how I could even do it w/ NoSQL unless I just duplicated IDs/data everywhere.

                                What about it makes it relational? Is it financial data?

                                It's people interacting w/ public web content as intermingle-able groups, having cross-pollinating conversations about it, relating each conversation, participant, tag, and content piece to each other, classifying it in personal and group contexts for future relation, and using various analytical algorithms, eventually AI, to analyze the relationships between the different data at each tier in the hierarchy and use it as a suggestion engine to expose users to new groups, conversations, content, and other users, in a nutshell.

                                That's like textbook NoSQL target content there. Conversations, groups, tagging, analytics.... it's like the "who's who" of NoSQL target topics.

                                Interesting, I'd never heard that before and RDBMS has been so great for any use case I've hit so far that I'd kind of written off NoSQL as being extraneous in any project I've needed a db for. Will look into it, thank you.

                                As far as hardware, how would what I've described so far work for going w/ NoSQL instead of MySQL? Anything you'd change specifically?

                                Not really (change) as speed is speed. Databases don't change that much one from another. They all like RAM, IOPS and other things the same. What IS different about a lot of NoSQL (and keep in mind this has nothing to do with being NoSQL vs. relational, but just product commonalities) is that NoSQL clusters tend to be 3+ nodes and relational clusters tend to be pairs.
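
One likely reason for the 3+ node habit (this is an assumption added for illustration, not something stated in the post) is that many clustered stores rely on majority voting to decide which nodes may keep accepting writes. A rough sketch of the arithmetic, with hypothetical helper names `majority` and `tolerated_failures`:

```python
def majority(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can be lost while a majority is still reachable."""
    return nodes - majority(nodes)


for n in (2, 3, 5):
    print(f"{n} nodes: majority={majority(n)}, tolerated failures={tolerated_failures(n)}")
# 2 nodes: majority=2, tolerated failures=0
# 3 nodes: majority=2, tolerated failures=1
# 5 nodes: majority=3, tolerated failures=2
```

Under that voting model a pair cannot lose either node and still hold a majority, which is why three is usually the practical minimum for hardware planning.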

                                • scottalanmillerS
                                  scottalanmiller
                                  last edited by

                                  BTW, we are posting on a system that handles everything on the NoSQL MongoDB platform.

                                  • scottalanmillerS
                                    scottalanmiller @creayt
                                    last edited by

                                    @creayt said in Is this server strategy reckless and/or insane?:

                                    Interesting, I'd never heard that before and RDBMS has been so great for any use case I've hit so far that I'd kind of written off NoSQL as being extraneous in any project I've needed a db for. Will look into it, thank you.

                                    Until ~10 years ago, RDBMS were so dominant that it was just "how everything was done." But as SaaS started to explode, the needs for growth and performance changed and NoSQL systems started to take off. They are really where the bulk of new stuff goes today, at least of big commercial stuff. SaaS vendors outside of financial use them for nearly everything. They are what power things like Google, Facebook, Change and other large websites that have to handle insane levels of data all over the world.

                                    • creaytC
                                      creayt @scottalanmiller
                                      last edited by creayt

                                      @scottalanmiller said in Is this server strategy reckless and/or insane?:

                                      @creayt said in Is this server strategy reckless and/or insane?:

                                      Interesting, I'd never heard that before and RDBMS has been so great for any use case I've hit so far that I'd kind of written off NoSQL as being extraneous in any project I've needed a db for. Will look into it, thank you.

                                      Until ~10 years ago, RDBMS were so dominant that it was just "how everything was done." But as SaaS started to explode, the needs for growth and performance changed and NoSQL systems started to take off. They are really where the bulk of new stuff goes today, at least of big commercial stuff. SaaS vendors outside of financial use them for nearly everything. They are what power things like Google, Facebook, Change and other large websites that have to handle insane levels of data all over the world.

                                      Have you found any interesting sources talking about what Facebook uses NoSQL for? Here's a recent article from one of their lead DB engineers talking about how they primarily use MySQL for what sounds like most of the persistent stuff that needs to scale to large numbers of users ( mentions shares, comments, and likes explicitly ). Apparently they've written their own storage engine for MySQL which dominates InnoDB and actively maintain their own branch of MySQL itself, which was last committed to 2 hours ago.

                                      https://code.facebook.com/posts/190251048047090/myrocks-a-space-and-write-optimized-mysql-database/

                                      • creaytC
                                        creayt @scottalanmiller
                                        last edited by creayt

                                        @scottalanmiller

                                        In the article I linked to, dude says this: "There are many reasons why we use MySQL at Facebook. MySQL is amenable to automation, making it easy for a small team to manage thousands of MySQL servers..."

                                        Gulp. Thousands. Of. Nodes. Those guys.

                                        • scottalanmillerS
                                          scottalanmiller @creayt
                                          last edited by

                                          @creayt said in Is this server strategy reckless and/or insane?:

                                          @scottalanmiller said in Is this server strategy reckless and/or insane?:

                                          @creayt said in Is this server strategy reckless and/or insane?:

                                          Interesting, I'd never heard that before and RDBMS has been so great for any use case I've hit so far that I'd kind of written off NoSQL as being extraneous in any project I've needed a db for. Will look into it, thank you.

                                          Until ~10 years ago, RDBMS were so dominant that it was just "how everything was done." But as SaaS started to explode, the needs for growth and performance changed and NoSQL systems started to take off. They are really where the bulk of new stuff goes today, at least of big commercial stuff. SaaS vendors outside of financial use them for nearly everything. They are what power things like Google, Facebook, Change and other large websites that have to handle insane levels of data all over the world.

                                          Have you found any interesting sources talking about what Facebook uses NoSQL for? Here's a recent article from one of their lead DB engineers talking about how they primarily use MySQL for what sounds like most of the persistent stuff that needs to scale to large numbers of users ( mentions shares, comments, and likes explicitly ). Apparently they've written their own storage engine for MySQL which dominates InnoDB and actively maintain their own branch of MySQL itself, which was last committed to 2 hours ago.

                                          https://code.facebook.com/posts/190251048047090/myrocks-a-space-and-write-optimized-mysql-database/

                                          That's a weird article. I'm not sure how much I'd trust it. Even though it is hosted on Facebook, it doesn't feel logical, and it doesn't match anything we see anywhere else. It sounds like, from how they describe it, it's one small piece used for isolated processes. But even in what they describe, it's not how you are picturing it. They are using a NoSQL database that is just managed by MySQL. MySQL itself is a management platform, not a database. Rocks is their database and that is non-relational. So nothing they are talking about there applies to you. That they manage it via MySQL is interesting, but not useful in your case.

                                          Generally, though, Hadoop and Cassandra are what is behind Facebook's main services.

                                          https://stackoverflow.com/questions/1113381/what-databases-do-the-world-wide-webs-biggest-sites-run-on

                                          • scottalanmillerS
                                            scottalanmiller @creayt
                                            last edited by

                                            @creayt said in Is this server strategy reckless and/or insane?:

                                            @scottalanmiller

                                            In the article I linked to, dude says this: "There are many reasons why we use MySQL at Facebook. MySQL is amenable to automation, making it easy for a small team to manage thousands of MySQL servers..."

                                            Gulp. Thousands. Of. Nodes. Those guys.

                                            This is the NoSQL behind the scenes of what they are using.

                                            http://leveldb.org/
