WEBVTT
NOTE
duration:"00:48:03.6930000"
language:en-us
NOTE Confidence: 0.901675164699554
00:00:00.350 --> 00:00:06.380
So we're here to talk about Azure SQL Database Hyperscale. How many people have heard about Hyperscale?
NOTE Confidence: 0.939228117465973
00:00:07.620 --> 00:00:09.310
How many people think they know how it works?
NOTE Confidence: 0.900280475616455
00:00:10.710 --> 00:00:11.900
OK, that's why we're here.
NOTE Confidence: 0.893144488334656
00:00:13.600 --> 00:00:26.930
So to give some context: this is all part of Azure SQL Database, the Azure relational database platform. It is Azure SQL Database; it's not a different thing. It's just a different
NOTE Confidence: 0.846875786781311
00:00:28.030 --> 00:00:32.390
performance tier of Azure SQL Database. So you've got
NOTE Confidence: 0.84410035610199
00:00:34.860 --> 00:00:54.870
standard and premium; general purpose and business critical are the vCore versions of those. The vCore model separates compute from storage, so you can buy cores separately from the space. And then Hyperscale is another performance tier in between those two, but it's all part of
NOTE Confidence: 0.891901016235352
00:00:56.500 --> 00:01:16.510
the Azure relational database platform, and you get all the goodness of the platform with it. It's intelligent: you have intelligent protection around the database, where we're looking for intrusion attempts and protecting your database from intruders, and intelligent performance, where we're monitoring the telemetry that comes off the database. We don't look at your data,
NOTE Confidence: 0.872217655181885
00:01:17.340 --> 00:01:37.350
but we look at the telemetry that describes how things are working, like DMVs for wait stats and that sort of thing, and we'll do tuning in the background. We can suggest indexes for you; if you flip the switch, we'll make them for you and monitor whether that was a gain or not, and if it's not, we'll pull it back. Pretty cool stuff, and that's all
NOTE Confidence: 0.789147138595581
00:01:58.940 --> 00:02:02.780
You scale this workload up when that peak comes in, and scale it back down.
NOTE Confidence: 0.892196476459503
00:02:03.960 --> 00:02:23.970
Resource governance smooths out performance and makes it more predictable. It's trusted: we have a lot of security features built in, again across the board. This is a storage platform more than anything else, so it's orthogonal to query performance and query behavior. We have business continuity, HA/DR built into the
NOTE Confidence: 0.900546312332153
00:02:24.760 --> 00:02:28.390
platform, not something you have to worry about. It's a fully managed platform as a service.
NOTE Confidence: 0.86102169752121
00:02:30.010 --> 00:02:33.470
Industry compliance certifications.
NOTE Confidence: 0.904173314571381
00:02:34.670 --> 00:02:47.040
Enterprise security and isolation, so we have an isolated environment for you. It's one of the most secure cloud platforms out there; as far as government certifications and all the other certifications go, Azure has more
NOTE Confidence: 0.869488179683685
00:02:48.130 --> 00:03:04.390
certifications than any other platform out there. So all of this goodness applies across the board: to Azure SQL Database, SQL Server on Azure Virtual Machines, and Azure Database for MySQL, PostgreSQL, and MariaDB. Those all live on the Azure database platform.
NOTE Confidence: 0.903411149978638
00:03:05.800 --> 00:03:07.520
So that's kind of the context of where we're at.
NOTE Confidence: 0.877653419971466
00:03:08.690 --> 00:03:13.330
And again, there are a lot of deployment options with it. You've got
NOTE Confidence: 0.90104216337204
00:03:14.240 --> 00:03:34.250
singleton databases, which is the Azure SQL Database we've known for a long time: you have a connection string that connects you to a single database, and it's just that database. You have elastic pools, where what you're paying for is a collection of resources, so you have a defined amount of resources and you share those
NOTE Confidence: 0.884597420692444
00:03:35.040 --> 00:03:55.050
resources among a pool of databases. The idea there is, if you have a model where you have maybe a multi-tenant application and each tenant is its own database, you buy a pool of resources and count on the fact that not all of your tenants are going to spike at the same instant. They're going to spike independently, so you size for the maximum aggregate instead of
NOTE Confidence: 0.919295072555542
00:03:55.980 --> 00:04:09.800
adding up the peak of each possible client, and you end up getting a good performance experience for all of your clients without having to pay for the maximum for each of them. So it's a more economical way of purchasing.
NOTE Confidence: 0.886144399642944
00:04:10.800 --> 00:04:30.810
And then managed instance is the newer offering, and that's where instead of being connected to a database, you're connected to a full SQL instance. It's still platform as a service, it's still fully managed: we manage the platform for you, we take care of HA/DR, we do backups and restores for you, all of that is there. But instead of connecting to a single database, you connect to an instance.
NOTE Confidence: 0.903520107269287
00:04:31.600 --> 00:04:39.760
That means you have access to cross-database transactions. You have access to Agent jobs. All the richness of a SQL instance without the headaches of managing it.
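NOTE
As an aside, a minimal sketch of what instance scope buys you: a cross-database query, which works against a managed instance but not against a single database. The host, credentials, and table names here are hypothetical.
  sqlcmd -S tcp:mymi.example.database.windows.net -d SalesDb -U sqladmin -P '<password>' \
    -Q "SELECT o.OrderID, c.Name FROM dbo.Orders AS o JOIN BillingDb.dbo.Customers AS c ON c.CustomerID = o.CustomerID"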
NOTE Confidence: 0.890127241611481
00:04:41.730 --> 00:04:44.410
So those are the deployment options and.
NOTE Confidence: 0.807793796062469
00:04:46.140 --> 00:04:51.490
Hyperscale is just another database that can be
NOTE Confidence: 0.859278380870819
00:04:52.530 --> 00:05:09.830
deployed in those options. So right now we have Hyperscale in singleton databases, we're working on Hyperscale in managed instance, and at some point we'll probably add it to elastic pools if there's demand. Hyperscale is a little bit of a different beast because it accommodates such huge data sizes.
NOTE Confidence: 0.804479897022247
00:05:12.560 --> 00:05:15.160
So, an overview of Hyperscale.
NOTE Confidence: 0.931497395038605
00:05:17.900 --> 00:05:25.350
What's the problem statement here? We're dealing with very large datasets and that presents some unique challenges to databases.
NOTE Confidence: 0.861786544322968
00:05:26.440 --> 00:05:46.450
Size of data: anything you do with the database that requires a size-of-data operation gets exceedingly painful when the size of data gets gargantuan. So if you have a 20, 50, 100 terabyte database and you need to do a point-in-time restore, you're going to come back next week, right?
NOTE Confidence: 0.892769992351532
00:05:47.610 --> 00:06:07.620
So operations can take a really long time, and that gets very painful. While that operation is going on, you're going to degrade performance, because it's sucking up resources, and it can cause outages and downtime. And if you do run out of space, provisioning more storage at that scale can be really painful. If you're doing this on premises,
NOTE Confidence: 0.856549561023712
00:06:08.410 --> 00:06:20.880
you're talking about rolling out new SANs and wiring them together; if you're talking about CPU resources, that's not easy at that scale either. So there are a lot of challenges having to do with the scale that we're operating at.
NOTE Confidence: 0.849707305431366
00:06:22.220 --> 00:06:31.490
Scaling compute: if you want to size your box on premises for the maximum peak you might need in terms of cores, that gets pretty spendy too.
NOTE Confidence: 0.887217164039612
00:06:33.350 --> 00:06:46.250
So what is Hyperscale? Fundamentally, it's a new storage architecture under SQL Server. It's the same SQL Server engine; we split off the storage engine from the SQL Server process.
NOTE Confidence: 0.859774708747864
00:06:47.760 --> 00:07:07.770
The storage engine is the part that deals with physical storage, so the interface is at the page ID level: when the query engine needs a specific page, it queries the storage engine by page ID. We've now taken the storage engine out of the SQL Server process and made it
NOTE Confidence: 0.783906161785126
00:07:08.560 --> 00:07:09.910
scale-out microservices.
NOTE Confidence: 0.874890387058258
00:07:10.820 --> 00:07:18.740
That's the nut of what Hyperscale is: a new storage architecture that was designed for the cloud, for cloud scale.
NOTE Confidence: 0.877298593521118
00:07:20.680 --> 00:07:37.040
It's architected for the cloud. It's completely compatible with Azure SQL Database, because we haven't touched the query engine, so query performance is completely unchanged. The way you manage it is largely unchanged; it's a fully managed platform-as-a-service offering.
NOTE Confidence: 0.898780405521393
00:07:39.240 --> 00:07:43.840
The whole idea here is to enable you to have VLDB, very large database, operations without the headaches.
NOTE Confidence: 0.895902395248413
00:07:44.810 --> 00:07:50.920
If you have a very large database, at some point you're going to restore it; we aim to make that a lot less painful.
NOTE Confidence: 0.896459877490997
00:07:52.490 --> 00:08:06.980
Virtually no limits on size. You'll see 100 terabytes quoted in the literature around this; there's nothing architecturally significant about the number 100. It's a round number, and it happened to be larger than the competition.
NOTE Confidence: 0.910299479961395
00:08:07.930 --> 00:08:12.540
But there's nothing that really limits us to 100 terabytes.
NOTE Confidence: 0.900288283824921
00:08:13.450 --> 00:08:15.730
It just takes a long time to gather that much test data.
NOTE Confidence: 0.912149429321289
00:08:18.760 --> 00:08:25.190
So that's what Hyperscale is, fundamentally. That's what we're talking about here.
NOTE Confidence: 0.890066146850586
00:08:26.750 --> 00:08:46.760
So we're going to talk about some of the capabilities, what Hyperscale can do for you and how it can deliver for you, and then Elaine is going to go into the architecture for how we get there and how we do it. It always makes more sense to me to talk about what you're getting for it first, and then talk about how we do it.
NOTE Confidence: 0.87651264667511
00:08:50.860 --> 00:09:08.610
So this is Azure SQL Database. It has full compatibility with all the other incarnations of Azure SQL Database. No syntax changes, no unique changes in queries, because the query engine is completely unchanged. You just write queries and it runs.
NOTE Confidence: 0.889389038085938
00:09:11.700 --> 00:09:20.790
Management tools are unchanged. You have the same suite of management tools, whether it's Management Studio or the Azure management portal.
NOTE Confidence: 0.918525218963623
00:09:22.210 --> 00:09:24.690
All of those tools work just like they always have.
NOTE Confidence: 0.852734208106995
00:09:28.170 --> 00:09:39.530
Seamless development: offline and online access, on-premises and cloud-designed apps. All of the application frameworks just work; you can do data imports.
NOTE Confidence: 0.886208057403564
00:09:40.460 --> 00:09:45.530
When it's available in managed instance, you'll be able to do restores into it.
NOTE Confidence: 0.893209457397461
00:09:48.510 --> 00:10:08.520
Reliable and available: we have multiple levels of reliability that you'll see when Elaine goes through the architecture. There are more moving pieces within a Hyperscale database, which is how we get to that scale, and at each level there's redundancy to keep you from
NOTE Confidence: 0.651854991912842
00:10:08.540 --> 00:10:10.910
having any downtime.
NOTE Confidence: 0.879996061325073
00:10:11.880 --> 00:10:20.090
No single points of failure, and four-nines availability just like the rest of Azure SQL Database. So, any questions on all that?
NOTE Confidence: 0.392626494169235
00:10:22.050 --> 00:10:22.380
No.
NOTE Confidence: 0.819807171821594
00:10:26.050 --> 00:10:34.300
It's scalable, and not only is it scalable, but you can scale quickly. Data size and cores scale independently; that's the vCore model.
NOTE Confidence: 0.880160391330719
00:10:35.260 --> 00:10:38.940
You can scale a 15 terabyte database
NOTE Confidence: 0.857094466686249
00:10:40.380 --> 00:10:55.610
in minutes to add or remove cores. We'll show you how that works. You're not moving any data around when you scale; changing from, say, a 2-core to an 80-core compute just changes the compute.
NOTE Confidence: 0.916410207748413
00:10:56.900 --> 00:10:59.570
And there are virtually no size-of-data operations.
NOTE Confidence: 0.878100872039795
00:11:02.510 --> 00:11:05.570
So I've made a lot of claims here. Let's see what some of this looks like.
NOTE Confidence: 0.883871078491211
00:11:06.490 --> 00:11:09.690
I'm going to show you how you create a Hyperscale database.
NOTE Confidence: 0.820810377597809
00:11:13.020 --> 00:11:14.040
So we're here in the portal.
NOTE Confidence: 0.88882040977478
00:11:23.330 --> 00:11:25.330
We'll go into my server and create a new database.
NOTE Confidence: 0.721850574016571
00:11:30.320 --> 00:11:32.380
Give it a good name: GetsDB.
NOTE Confidence: 0.894867897033691
00:11:33.570 --> 00:11:36.110
Start with a blank database, it's on my server.
NOTE Confidence: 0.916317403316498
00:11:58.050 --> 00:12:02.110
Price-wise it lands somewhere in the middle, closer to the business critical end for a lot of workloads, but
NOTE Confidence: 0.848150074481964
00:12:03.150 --> 00:12:12.160
that's where it lands. So now we have Hyperscale selected. You have your choice of Gen 4 or Gen 5; choose Gen 5 with
NOTE Confidence: 0.462021172046661
00:12:13.590 --> 00:12:14.960
two vCores.
NOTE Confidence: 0.912952959537506
00:12:16.120 --> 00:12:22.480
You can choose the number of secondary replicas: the default is one, and you can have anywhere from zero, if you want, up to 4 secondary replicas.
NOTE Confidence: 0.890797436237335
00:12:23.620 --> 00:12:25.380
Does anybody notice what is missing here?
NOTE Confidence: 0.892365515232086
00:12:27.070 --> 00:12:29.770
If you're used to creating Azure SQL databases, what's missing?
NOTE Confidence: 0.830689966678619
00:12:32.160 --> 00:12:40.660
Size. There is no defined max size for a Hyperscale database. We will just grow it as needed, so as it fills up we'll just
NOTE Confidence: 0.851569473743439
00:12:41.650 --> 00:12:47.200
give you more page servers — we'll tell you what those are in a minute — and just expand the database. Question?
NOTE Confidence: 0.827501952648163
00:12:49.540 --> 00:12:50.070
Excuse me.
NOTE Confidence: 0.875049412250519
00:12:52.090 --> 00:12:58.140
So what about the cost? If you're scaling this stuff, what does it cost you? We bill you for the allocated space.
NOTE Confidence: 0.86956787109375
00:12:59.180 --> 00:13:19.190
We may add in another terabyte's worth of capacity: when you go over a certain threshold, we'll add it. We add capacity in chunks of a terabyte, but we bill at a minimum of 5 GB and then at 1 GB increments past that, so you pay for what you use within that terabyte. We're giving you a terabyte but billing you for the
NOTE Confidence: 0.830291092395782
00:13:19.980 --> 00:13:21.820
portion of it that you're actually using. So if, for example, the database has grown into its second terabyte but is only using 1.2 TB, you're billed for roughly 1.2 TB, not 2.
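NOTE
For reference, one way to check the used and allocated space you'd be billed against is sp_spaceused, run against the database itself; a sketch using the hypothetical server and database names from this demo.
  sqlcmd -S tcp:myserver.database.windows.net -d GetsDB -U sqladmin -P '<password>' \
    -Q "EXEC sp_spaceused"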
NOTE Confidence: 0.91147369146347
00:13:23.650 --> 00:13:24.940
Good question. Anything else?
NOTE Confidence: 0.764468491077423
00:13:28.250 --> 00:13:28.660
Yes.
NOTE Confidence: 0.885738253593445
00:13:39.260 --> 00:13:43.400
So, the scaling process: is there an overhead to that? We will actually
NOTE Confidence: 0.889729082584381
00:13:44.340 --> 00:13:51.410
bring in the next increment of storage before the current increment is used up, and you're not going to feel it at all.
NOTE Confidence: 0.64708799123764
00:13:52.480 --> 00:13:54.440
We model.
NOTE Confidence: 0.784777224063873
00:13:55.830 --> 00:13:59.460
Yeah, I'll let Elaine explain that part. I'm getting ahead of myself.
NOTE Confidence: 0.862440764904022
00:14:04.990 --> 00:14:06.760
Scaling operations are fully online.
NOTE Confidence: 0.847787320613861
00:14:09.330 --> 00:14:15.140
So if you're scaling cores, we just bring in a new compute head with more cores and then do a failover.
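NOTE
What that core-scaling operation looks like outside the portal, as an Azure CLI sketch; the resource group, server, and database names are hypothetical.
  # Rescale an existing Hyperscale database to 8 vCores. Storage is untouched;
  # the service stands up a new compute head and fails over to it when ready.
  az sql db update --resource-group myRG --server myserver --name GetsDB --capacity 8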
NOTE Confidence: 0.761495530605316
00:14:20.270 --> 00:14:20.770
With what?
NOTE Confidence: 0.884081959724426
00:14:22.820 --> 00:14:25.060
What's the situation with long running queries?
NOTE Confidence: 0.88099604845047
00:14:30.450 --> 00:14:49.770
So the question is what happens when you have a long-running query: you've been running for 6 hours and you decide to scale down. At some point during the scale operation there's going to be a failover, when we move over to the compute node that has fewer cores, so if you're in the middle of that transaction, you're going to end up rolling back. You want to plan that accordingly.
NOTE Confidence: 0.910777747631073
00:14:50.750 --> 00:14:52.990
Fortunately, that's something you have full control over.
NOTE Confidence: 0.609007477760315
00:14:58.040 --> 00:14:58.540
OK.
NOTE Confidence: 0.729388475418091
00:15:09.960 --> 00:15:10.710
And why is it not?
NOTE Confidence: 0.953435182571411
00:15:22.390 --> 00:15:25.200
So one important thing to understand at this point:
NOTE Confidence: 0.866031050682068
00:15:27.260 --> 00:15:30.480
once the database is Hyperscale, you can't make it non-Hyperscale.
NOTE Confidence: 0.908198297023773
00:15:31.660 --> 00:15:44.800
You can convert a non-Hyperscale database into Hyperscale, but you can't go back at this point, because of the way we change the storage. Someday we hope to make that available, but it's not on the near-term horizon, so it's just something you need to be aware of.
NOTE Confidence: 0.836550891399384
00:15:49.310 --> 00:15:54.910
And that's it. The database will basically be created in the background; no need to really wait around for it.
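NOTE
For reference, an Azure CLI sketch of the same creation the portal demo just walked through; the resource group and server names are hypothetical, and the replica-count flag has changed names across CLI versions.
  # Create a Hyperscale database: Gen5 hardware, 2 vCores, 1 readable secondary.
  # Note there is no max-size parameter: Hyperscale grows storage as needed.
  az sql db create --resource-group myRG --server myserver --name GetsDB \
    --edition Hyperscale --family Gen5 --capacity 2 --ha-replicas 1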
NOTE Confidence: 0.876294910907745
00:16:02.530 --> 00:16:22.540
So we're delivering high performance, low latency, and high throughput for large databases. Even though we're dealing with very large sizes, we can do transactions very quickly. Backups are snapshot-based: there's virtually zero impact to the running workload for doing a backup, and we'll get into how we pull that off. They're
NOTE Confidence: 0.867675423622131
00:16:23.730 --> 00:16:43.740
snapshot-based and fully buffered from the running engine, so the engine literally never feels anything. Rapid database restore: we claim we can restore a database in time independent of the size of data, so whether it's a 1 gigabyte database or a 50 terabyte database, it's going to take about the same time.
NOTE Confidence: 0.883832514286041
00:16:44.530 --> 00:16:46.810
Which is an interesting accomplishment.
NOTE Confidence: 0.864318907260895
00:16:49.620 --> 00:16:57.840
Speaking of which, how long do you think it would take to restore a 50 terabyte database to a point in time? Anybody? Days? Weeks?
NOTE Confidence: 0.86038339138031
00:16:59.260 --> 00:17:00.430
8 minutes.
NOTE Confidence: 0.92088109254837
00:17:06.170 --> 00:17:11.880
So I have a video, just so you don't have to wait through the whole time. You'll see a stopwatch on the right side.
NOTE Confidence: 0.860073804855347
00:17:14.810 --> 00:17:32.470
The stopwatch is there to give you a feel for the time. That's a full 50 terabytes — it's not just a shell that big, we had 50 terabytes of data in there. So we're doing a point-in-time restore: we've picked the point in time and hit go.
NOTE Confidence: 0.831303000450134
00:17:33.700 --> 00:17:35.210
Start the timer.
NOTE Confidence: 0.883622169494629
00:17:38.920 --> 00:17:42.810
So we're going to look at the server that this database lives in.
NOTE Confidence: 0.8303142786026
00:17:44.270 --> 00:17:49.450
You can see from the notifications that we're doing a restore.
NOTE Confidence: 0.906500637531281
00:17:52.650 --> 00:17:59.110
You can see that "restoring" progress, so we're going to sit here and watch until that database actually comes online.
NOTE Confidence: 0.875522911548615
00:18:00.110 --> 00:18:04.420
So again this is a database that has 50 terabytes of data in it.
NOTE Confidence: 0.852398157119751
00:18:05.870 --> 00:18:10.440
I'm getting ready to pause it when it gets to that point.
NOTE Confidence: 0.867901384830475
00:18:13.340 --> 00:18:21.790
This database has, I believe, two cores, so we're running a 50 terabyte database on 2 cores, which is an interesting metric as well.
NOTE Confidence: 0.81795459985733
00:18:23.380 --> 00:18:30.190
And again, you can fluidly scale this up and down as you need to. It's not a problem.
NOTE Confidence: 0.373110294342041
00:18:32.840 --> 00:18:33.340
So.
NOTE Confidence: 0.894765555858612
00:18:34.370 --> 00:18:40.970
Just so you don't think I'm trying to pull anything over on you, I made the transition nice and flashy. You see we're at about 7 minutes now
NOTE Confidence: 0.884320139884949
00:18:42.120 --> 00:18:48.760
into the restore process, and you just saw the restored database pop up in the list at the bottom, down here.
NOTE Confidence: 0.845250904560089
00:18:50.650 --> 00:19:06.320
So it's popped up, and it's running recovery right now. The database has been fully established, all the components instantiated, the compute head has come up. Now we're running recovery, and that database will come online fairly shortly — another few seconds.
NOTE Confidence: 0.487037718296051
00:19:16.180 --> 00:19:16.890
So.
NOTE Confidence: 0.747607469558716
00:19:23.120 --> 00:19:24.980
There it's done.
NOTE Confidence: 0.883010149002075
00:19:26.010 --> 00:19:29.580
7 minutes, 50 seconds to restore a 50 terabyte database to a point in time.
NOTE Confidence: 0.923727095127106
00:19:30.730 --> 00:19:33.150
I challenge you to find another technology that can do that.
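NOTE
Kicking off that point-in-time restore from the Azure CLI would look roughly like this; the names and the timestamp are hypothetical.
  # Restore into a new database as of a UTC point in time. Because the restore is
  # built from page-server snapshots, the duration is roughly constant regardless of size.
  az sql db restore --resource-group myRG --server myserver --name BigDB \
    --dest-name BigDB-restored --time "2019-05-01T12:00:00"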
NOTE Confidence: 0.846932828426361
00:19:46.270 --> 00:19:53.200
OK, another demo: how we migrate an existing database to Hyperscale.
NOTE Confidence: 0.843647718429565
00:19:59.620 --> 00:20:06.410
On my server I have AdventureWorks here; AdventureWorks is a general purpose 2-core database.
NOTE Confidence: 0.870675027370453
00:20:07.670 --> 00:20:17.630
To transfer that database into a Hyperscale database, the operation is pretty complicated: you choose Hyperscale here, acknowledge that you can't go back,
NOTE Confidence: 0.762829065322876
00:20:19.050 --> 00:20:19.820
And hit apply.
NOTE Confidence: 0.901216149330139
00:20:21.150 --> 00:20:23.330
That's it. That's the whole process.
NOTE Confidence: 0.934597194194794
00:20:24.390 --> 00:20:26.250
In the background, we're going to be
NOTE Confidence: 0.851532340049744
00:20:27.250 --> 00:20:32.060
constructing the Hyperscale components around that data. If it's
NOTE Confidence: 0.905986726284027
00:20:33.210 --> 00:20:45.370
general purpose, we'll take the Azure storage in place and just build the page servers on top of that; if it's business critical, we're going to transfer the data, so it's going to take a little longer. But it takes just a couple of minutes in either case.
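NOTE
The same migration as an Azure CLI sketch (hypothetical resource group and server); remember this move is one-way today.
  # Convert an existing general purpose database to the Hyperscale tier.
  az sql db update --resource-group myRG --server myserver --name AdventureWorks \
    --edition Hyperscale --family Gen5 --capacity 2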
NOTE Confidence: 0.926200687885284
00:20:49.470 --> 00:20:52.190
I think I'm going to skip that for now.
NOTE Confidence: 0.873435914516449
00:20:53.970 --> 00:21:13.980
So at this point Elaine is going to take over. Thank you. OK, so we're going to jump into the actual architecture under the covers for Hyperscale, and it is slightly different from what people would expect to see in SQL Server today or Azure SQL Database. There are 4 main components to
NOTE Confidence: 0.847358345985413
00:21:14.770 --> 00:21:34.460
Hyperscale databases. There's the compute, which is the standard compute that you've been using all along; it has local SSDs on it, and we actually put RBPEX, the resilient buffer pool extension, on those. There is a log service, which actually has its own caching
NOTE Confidence: 0.88830304145813
00:21:35.430 --> 00:21:55.440
aspect to it as well, with local SSDs attached as well. Then you have these one terabyte page servers — every single one of those page servers is actually running SQL Server, and we'll look a bit more at that — and then there's a remote storage aspect, which is where all the snapshot backups and those things happen, so that they don't affect
NOTE Confidence: 0.867937207221985
00:21:56.360 --> 00:22:16.370
the performance while we do these backups in the background. So if we look, we have the compute, the log, the page servers, and the remote storage. What happens is, when you run a write, the compute writes into the log service; the log service will then go
NOTE Confidence: 0.86606627702713
00:22:17.160 --> 00:22:37.170
populate the page servers, and the page servers will then, at a checkpoint, populate the remote storage. And from a cold start — if you had to start a database up, as in the case of the one we just restored — that remote storage actually populates
NOTE Confidence: 0.868856966495514
00:22:37.960 --> 00:22:44.840
the page servers and all the tiers up, so you have these multiple layers of storage involved.
NOTE Confidence: 0.867277920246124
00:22:45.750 --> 00:23:05.760
And then the page servers are obviously attached to the compute. So what we have is these standard computes over there; the compute server, as I said, has the resilient buffer pool extension in it. For scaling operations — Kevin mentioned this — we can create a new
NOTE Confidence: 0.449924856424332
00:23:06.550 --> 00:23:07.700
compute
NOTE Confidence: 0.889529943466187
00:23:08.720 --> 00:23:28.730
component and we can just fail over to it, so it's an online operation, irrespective of size of data, because the storage is separated from the compute in this case. When the new compute node comes up, it's only a case of basically refreshing the local cache that's there. So we have this notion of constant-time
NOTE Confidence: 0.879125654697418
00:23:29.520 --> 00:23:49.530
scale up and down in Hyperscale. With the readable secondaries — and this is very different from some of the other architectures that exist — we attach all of your readable secondaries to the same set of data, so you're not duplicating your data. You don't have 3 physical copies of your data; it's one copy of data that has
NOTE Confidence: 0.878871500492096
00:23:50.320 --> 00:24:04.480
the secondaries attached to it. And that's why we're saying that, from an architectural standpoint, you could technically have a lot of these — more than just the 3 that are on this diagram.
NOTE Confidence: 0.84854918718338
00:24:05.690 --> 00:24:19.670
What this does allow you to do is, through the primary node, you can do read-write workloads, and by accessing the secondaries using ApplicationIntent=ReadOnly, you can route all of your
NOTE Confidence: 0.871016442775726
00:24:21.000 --> 00:24:26.630
read-heavy workloads onto those secondaries, so they do not impact at all the write workload that you're trying to do.
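NOTE
Routing a session to a readable secondary is purely a connection-string concern; for example, sqlcmd exposes it through the -K application-intent flag (server and credentials hypothetical).
  # ReadOnly intent lands on a secondary replica; omit -K for the read-write primary.
  # The query confirms where you landed: READ_ONLY on a secondary, READ_WRITE on the primary.
  sqlcmd -S tcp:myserver.database.windows.net -d GetsDB -U sqladmin -P '<password>' -K ReadOnly \
    -Q "SELECT DATABASEPROPERTYEX(DB_NAME(), 'Updateability')"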
NOTE Confidence: 0.850388824939728
00:24:28.090 --> 00:24:48.100
So what happens is the compute server actually writes into a log landing zone, and today that landing zone is running on top of Premium Storage, so you have less than two-and-a-half milliseconds latency to that. One of the
NOTE Confidence: 0.854138493537903
00:24:48.890 --> 00:25:08.900
improvements coming is to move to Ultra SSDs, so you will have sub-millisecond latency into your log landing zone. The landing zone is attached to the log service, which has its own cache, and then once the transaction
NOTE Confidence: 0.86893218755722
00:25:09.690 --> 00:25:29.700
has been received in the log landing zone, the log service is then responsible for simultaneously replaying it into the secondaries as well as into the page servers. One thing that would be good to notice: as soon as that write to the landing zone completes,
NOTE Confidence: 0.818475127220154
00:25:30.490 --> 00:25:50.500
the transaction is committed — you don't wait on anything past that. That's a bunch of slides ahead, but it's fine. Then you have the remote storage — for some reason these slides are a little bit backwards together — and you have the page servers, those one terabyte page servers. Each
NOTE Confidence: 0.821702122688293
00:25:51.290 --> 00:25:57.220
server runs what is essentially a full RBPEX extension on it.
NOTE Confidence: 0.887008786201477
00:25:58.170 --> 00:26:03.460
On checkpoint, you write from the page server into the remote storage.
NOTE Confidence: 0.891995370388031
00:26:05.860 --> 00:26:23.000
You write from the page server into the remote storage, and what that allows us to do is infinitely scale out these page servers. So as Kevin said, we talk about 100 terabytes, but architecturally speaking, you could have
NOTE Confidence: 0.891701221466064
00:26:24.290 --> 00:26:26.880
any number of these page servers.
NOTE Confidence: 0.912538707256317
00:26:27.900 --> 00:26:47.810
In these specific examples, the one thing to keep in mind is that every single one of these page servers is actually running SQL Server. What that allows us to do is offload some of the things that are particularly restrictive in the VLDB space — things like checkpoints actually get offloaded onto the page servers themselves.
NOTE Confidence: 0.882485508918762
00:26:49.380 --> 00:27:09.390
If we then look at the backups: the backups are done as snapshots of the remote storage. That's why Kevin was saying the backups have zero impact on your actual workload — because they're running against that remote storage, not the page servers, not the compute nodes or any of those things. And then those backups, those snapshots, can be
NOTE Confidence: 0.713576257228851
00:27:10.730 --> 00:27:11.630
Once I.
NOTE Confidence: 0.880472898483276
00:27:12.760 --> 00:27:32.770
The log service is responsible for destaging the log into long-term storage, and that allows us to replay to a point in time from the log. And if we now look very specifically at writes — how does a write work in Hyperscale? From the primary compute, you
NOTE Confidence: 0.884909093379974
00:27:33.890 --> 00:27:53.900
initiate a write. That write will go to the log landing zone, and like we mentioned earlier, that's about or less than 2.5 milliseconds; when we move to the Ultra SSDs, that will be less than a millisecond. The minute that has been written to the landing zone,
NOTE Confidence: 0.899199783802032
00:27:54.770 --> 00:28:14.780
that transaction is committed, so you can move on. The log service is then responsible for simultaneously replaying that event into not only the secondaries but also into the page servers, and the page servers are ultimately responsible for offloading those into
NOTE Confidence: 0.871371507644653
00:28:17.620 --> 00:28:37.630
remote storage. And the log service will offload those transactions, once they complete, into the long-term log storage. So that's how a write works: it comes in, hits the landing zone, the transaction is considered committed, and then you have that being written to the
NOTE Confidence: 0.82065337896347
00:28:38.420 --> 00:28:43.090
secondaries as well as into the page servers. If we look at reads:
NOTE Confidence: 0.89466404914856
00:28:44.310 --> 00:28:59.900
these slides run in backwards order, so think about these servers coming up from a cold start, where there was no compute. The remote Azure storage is responsible for populating the RBPEX on the page server, right?
NOTE Confidence: 0.894627690315247
00:29:00.820 --> 00:29:05.720
The page servers are then responsible for populating the RBPEX on the compute node.
NOTE Confidence: 0.873820900917053
00:29:28.100 --> 00:29:48.110
If, however, that record is not in the compute node, we get a cache miss, in which case we will go call the page server to complete that query — and that will take less than 2 milliseconds. So you have these tiers of
NOTE Confidence: 0.902139604091644
00:29:48.900 --> 00:30:08.910
storage, as well as varying degrees of performance. For certain workloads, where you can ensure that all your hot data is sitting in the local RBPEX all the time, you will get really, really good performance out of that. If you do very large analytics workloads where you have to fetch data from the page
NOTE Confidence: 0.889274835586548
00:30:09.750 --> 00:30:15.900
servers, your performance will be exactly that: somewhere between general purpose and business critical.
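NOTE
One rough way to gauge which tier a workload is actually hitting is wait statistics: page-server fetches surface as I/O-related waits such as PAGEIOLATCH. A sketch with hypothetical names.
  sqlcmd -S tcp:myserver.database.windows.net -d GetsDB -U sqladmin -P '<password>' \
    -Q "SELECT TOP 5 wait_type, wait_time_ms FROM sys.dm_db_wait_stats ORDER BY wait_time_ms DESC"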
NOTE Confidence: 0.90666925907135
00:30:19.200 --> 00:30:39.210
If we then look at how backup and restore work in this case: for every terabyte of data used, you have a page server. In this specific example we have 3 page servers, so imagine this was a 3 terabyte database. For every page server, we take
NOTE Confidence: 0.885920524597168
00:30:39.420 --> 00:30:45.420
snapshots at given time intervals.
NOTE Confidence: 0.90674364566803
00:30:46.370 --> 00:30:52.990
We then have the log landing zone, the log service, and the long-term storage for the logs.
NOTE Confidence: 0.895762622356415
00:30:54.390 --> 00:31:14.400
The backups in this scenario have no impact, as we said, because they happen off snapshots of the remote storage. So when I kick off a point-in-time restore here, what will happen is we will choose the nearest snapshots that we have — one snapshot for every single page server,
NOTE Confidence: 0.883072555065155
00:31:15.190 --> 00:31:17.040
the nearest one to that point in time.
NOTE Confidence: 0.904307425022125
00:31:18.500 --> 00:31:28.780
We will then go and find the LSN of the oldest transaction we need and replay from the long-term log storage to the point in time that you want to restore to.
NOTE Confidence: 0.889598906040192
00:31:29.890 --> 00:31:33.410
What we then do, in a new environment, is copy those snapshots:
NOTE Confidence: 0.890389204025269
00:31:34.700 --> 00:31:54.710
we don't move data — this is a metadata thing, a shallow copy of the data, and that's why it's so fast. We then parallelize those copies, and that allows us to do the constant time. So we have very predictable times: it doesn't matter if this is 3 page servers or 300,
NOTE Confidence: 0.896988928318024
00:31:55.500 --> 00:32:07.520
the time it's going to take you to do those restores essentially equates to the amount of time it takes to restore the slowest of your page servers.
NOTE Confidence: 0.891927361488342
00:32:10.670 --> 00:32:17.970
The Azure storage then repopulates the new page servers for the point-in-time restore.
NOTE Confidence: 0.869802176952362
00:32:19.750 --> 00:32:39.760
The log service then replays: the primary compute comes back up, we attach it to the log, we do recovery only for the section that's valid, and then we attach the page servers. That whole database, plus its compute, will come up in that constant time.
NOTE Confidence: 0.877365291118622
00:32:40.550 --> 00:32:53.440
That demo Kevin showed you was 8 minutes, but that was 8 minutes to do the full restore plus create a new compute VM, attach it, and do all of these steps — all within those 8 minutes.
NOTE Confidence: 0.906103730201721
00:32:54.930 --> 00:33:08.220
And that's the architecture that we have for Hyperscale — very different from what we have today, where in business critical you have four nodes in an Always On configuration.
NOTE Confidence: 0.900361716747284
00:33:09.650 --> 00:33:29.660
If we look — I'm going to play this whole thing out — this is how Hyperscale stacks up against the current deployments in the Azure platform. We have general purpose, which is all Azure remote storage, premium storage, and there you're expecting about 2 milliseconds latency for data access. With business critical you have locally
NOTE Confidence: 0.87817108631134
00:33:30.450 --> 00:33:50.460
attached SSDs, which means you're getting data access at less-than-a-millisecond latencies. With Hyperscale you kind of have a hybrid of both: depending on your workload, you will either be hitting the local SSDs in the compute node, which has RBPEX, or you will have to go to the
NOTE Confidence: 0.884939610958099
00:33:51.250 --> 00:34:06.330
page server, in which case it will be roughly the general purpose type of performance that you would see. So you can think of it this way: in highly transactional environments, where you are writing a lot of data the whole time, you will get
NOTE Confidence: 0.908646762371063
00:34:07.370 --> 00:34:11.460
much closer to business critical performance than you would general purpose.
NOTE Confidence: 0.928714573383331
00:34:14.440 --> 00:34:25.240
Just very briefly: general purpose is probably suited for, I'd say, 80% of customer workloads — they should be fine running in general purpose.
NOTE Confidence: 0.907095789909363
00:34:26.550 --> 00:34:46.560
If we look at business critical, it was really built for those highly transactional systems that need consistently high IOPS running through them. And then with Hyperscale, the system has been optimized for very large databases, but we also see HTAP as being the
NOTE Confidence: 0.909129977226257
00:34:47.350 --> 00:34:57.490
case where you have a transactional workload running, but you also do some type of analytics or business reporting on the same system.
NOTE Confidence: 0.896062970161438
00:34:59.340 --> 00:35:19.350
If we look at how the pricing for Hyperscale works, this is why it's in between general purpose and business critical. Remember, with business critical you are getting a built-in readable secondary. So Hyperscale is about — well, will be — about
NOTE Confidence: 0.898039817810059
00:35:20.140 --> 00:35:22.990
120% the cost of a general purpose instance.
NOTE Confidence: 0.894489645957947
00:35:24.250 --> 00:35:29.660
When you create a secondary replica — those readable replicas that Kevin showed us —
NOTE Confidence: 0.877271950244904
00:35:30.770 --> 00:35:50.780
all your readable secondaries will be billed at the Azure Hybrid Benefit pricing, so they will be less the cost of the SQL license, irrespective of whether you bring a SQL license for them or not. So we see that Hyperscale with a readable secondary will still cost less than a business critical tier.
NOTE Confidence: 0.90924733877182
00:35:51.570 --> 00:35:59.400
And for certain workloads, you can be in the same ballpark as business critical performance with Hyperscale.
NOTE Confidence: 0.894101738929749
00:36:15.990 --> 00:36:35.440
So it's not supported right now; we plan on having it by the time we go GA. It's not that it's not possible for us to do, it's just that we're getting there. So absolutely, we're hoping it will be available by the time Hyperscale goes GA, and we are working to unblock that.
NOTE Confidence: 0.914528429508209
00:36:45.670 --> 00:37:05.680
I think some of the items that we are still trying to get completed, which might be issues for some customers, would be things like geo-replication. But other than that, from a surface area standpoint, it should be pretty much the same,
NOTE Confidence: 0.904743790626526
00:37:06.470 --> 00:37:26.480
taking into consideration that if you run certain workloads — like if you run a DBCC CHECKDB on a 50 terabyte database — you should understand what the implications of that are going to be. So there are things we're looking at optimizing in that space as well, but I know it says unsupported; we're working to
NOTE Confidence: 0.670068919658661
00:37:27.270 --> 00:37:29.380
have it available.
NOTE Confidence: 0.8718022108078
00:37:32.750 --> 00:37:52.760
Pricing. And then this is a super important slide, because this question comes up more often than it should, and that is the placement of Hyperscale versus Azure SQL DW. It's actually really, really simple: Hyperscale is still running an SMP engine. It is running the same engine
NOTE Confidence: 0.638371050357819
00:37:53.550 --> 00:37:54.750
that SQL Server runs on-box today.
NOTE Confidence: 0.886945009231567
00:37:56.070 --> 00:38:09.070
DW is running MPP, so it is really built for highly parallelized, big-analytics types of workloads. Hyperscale is not that product. So if you are trying to do
NOTE Confidence: 0.914517939090729
00:38:10.330 --> 00:38:30.340
100 terabyte data warehousing kinds of things, and your entire workload is super read-heavy and all you're doing is some type of analytics, you should probably be looking at moving to DW, not Hyperscale. Hyperscale — from what we've seen from customer feedback and a lot of the interactions we've had around this product — is
NOTE Confidence: 0.898139894008636
00:38:31.130 --> 00:38:51.140
very much falling into the operational data mart space: that layer of data before you hit your data warehouse, where you still have highly transactional work coming in but you then offload into some type of reporting layer after that. Conversely, Hyperscale does really well with big, large updates,
NOTE Confidence: 0.830592751502991
00:38:52.150 --> 00:38:56.680
whereas DW really doesn't. So those are the trade-offs there.
NOTE Confidence: 0.89897096157074
00:38:58.410 --> 00:39:18.420
And then just the last ones: we actually do have customers using this, and some of them are actually super successful at it. You will see that there's a very similar theme running through the 3 customers I'm going to speak about now. Pretty much the one standard thing for all of them is that their
NOTE Confidence: 0.906230330467224
00:39:19.210 --> 00:39:39.220
data is larger than 4 terabytes, because that's the current limitation that exists today. So that was one of the big motivating reasons, but they also had these data mart types of workloads, where they needed to do some analytics but also still had to be able to ingest an OLTP type of workload.
NOTE Confidence: 0.855681240558624
00:39:40.180 --> 00:40:00.190
So the first customer here: they were running OLTP databases as well as a DW, but that was on-premises. They have since swapped over to Azure, and they're now running Hyperscale to cater for both of those.
NOTE Confidence: 0.888038516044617
00:40:00.980 --> 00:40:13.070
Right, so they were not in the region of "all we need to do is massive analytics reporting workloads," but they were definitely in the zone of "we still have to do transactions here."
NOTE Confidence: 0.918097496032715
00:40:14.300 --> 00:40:34.310
The next customer we have: they have somewhere between 50 and 100 terabytes of data, so this was really the only option they had. And instead of doing the whole traditional approach, where you offload data from your data warehouse into
NOTE Confidence: 0.903227210044861
00:40:35.100 --> 00:40:55.110
Analysis Services and then do your reporting off of that, they do their operational reporting directly on top of Hyperscale. The big reason for that is, even if you look at something like DW, it has limitations around the number of concurrent connections it can take. We don't have those limitations, because this is still SMP SQL, so we can do that —
NOTE Confidence: 0.91839611530304
00:40:55.480 --> 00:41:09.560
10,800 or whatever the number is, it's a very high number of concurrent connections that we can take. So they went this route, where they ingest this large amount of data but can then do real-time reporting directly from that Hyperscale database.
NOTE Confidence: 0.880232036113739
00:41:11.060 --> 00:41:13.240
And then the very last one.
NOTE Confidence: 0.888365685939789
00:41:14.480 --> 00:41:34.490
This is actually a production deployment that's already been done. They have the architecture that I was speaking about, where they run DW as well as Hyperscale: they do the ingestion of their transactional workload into Hyperscale, and then they offload the
NOTE Confidence: 0.906342267990112
00:41:35.280 --> 00:41:49.250
business objects — the business value of that data — into the Azure DW. And they do this at scale, so it's not just one Hyperscale database; it's lots of Hyperscale databases ingesting data all the time and putting it into a single data warehouse repository.
NOTE Confidence: 0.867648243904114
00:41:50.600 --> 00:41:53.060
And then just to kind of recap.
NOTE Confidence: 0.919636130332947
00:41:54.500 --> 00:42:07.380
Hyperscale is basically going to bring you high performance, scalability, and reliability, in the sense that the architecture is very different and has been designed so that
NOTE Confidence: 0.895473420619965
00:42:08.380 --> 00:42:28.390
zero single points of failure exist within it. You will get the performance and the scale that you're expecting for these very large databases. We allow you to grow your database: instead of pre-committing to a fixed size, we now say you will grow as needed.
NOTE Confidence: 0.865382194519043
00:42:29.180 --> 00:42:49.190
We offer all the constant-time functions — backups, restores, the scale-up operations — and ultimately what this is going to lead to is allowing you to build on a platform that will future-proof a lot of your data workloads. So it'll be able to grow with you,
NOTE Confidence: 0.91180431842804
00:42:49.980 --> 00:42:56.890
but it will also have all the latest and greatest features that you're expecting from SQL Server, from managed instance, from SQL Database, all the time.
NOTE Confidence: 0.873029887676239
00:42:57.810 --> 00:43:17.820
And if you want to learn more about Hyperscale, here are some links. I don't know if they're sharing the decks for this conference — hopefully — but there is documentation online. You can go read up all about Hyperscale, and you can go see how the pricing works. And thank you for coming.
NOTE Confidence: 0.873843729496002
00:43:18.610 --> 00:43:21.310
If there are any questions, we'll take them now.
NOTE Confidence: 0.725156486034393
00:43:22.300 --> 00:43:22.780
Anyone.
NOTE Confidence: 0.897265613079071
00:43:30.600 --> 00:43:42.460
Yes, so that's a very good question. We will allow heterogeneous deployments for managed instance, so within a single subnet, for example, you could have
NOTE Confidence: 0.878875076770782
00:43:43.500 --> 00:43:47.310
general purpose, business critical, and Hyperscale databases in that subnet, yes.
NOTE Confidence: 0.846241652965546
00:43:51.220 --> 00:43:53.440
Oh yes, evaluation forms, please.
NOTE Confidence: 0.865894019603729
00:43:54.530 --> 00:43:55.260
As the other one.
NOTE Confidence: 0.809923589229584
00:43:56.830 --> 00:44:00.550
If no one has any other questions, thanks for attending.