| 1 | LWN.NET WEEKLY EDITION FOR AUGUST 30, 2018 \r |
| 2 | \r |
| 3 | \r |
| 4 | \r |
| 5 | o Reference: 0000763252\r |
| 6 | o News link: https://lwn.net/Articles/763252/\r |
| 7 | o Source link: \r |
| 8 | \r |
| 9 | \r |
| 10 | [1]Welcome to the LWN.net Weekly Edition for August 30, 2018\r |
| 11 | This edition contains the following feature content:\r |
| 12 | \r |
| 13 | [2]An introduction to the Julia language, part 1 : Julia is a\r |
| 14 | language designed for intensive numerical calculations; this\r |
| 15 | article gives an overview of its core features.\r |
| 16 | \r |
| 17 | [3]C considered dangerous : a Linux Security Summit talk on\r |
| 18 | what is being done to make the use of C in the kernel safer.\r |
| 19 | \r |
| 20 | [4]The second half of the 4.19 merge window : the final\r |
| 21 | features merged (or not merged) before the merge window closed\r |
| 22 | for this cycle.\r |
| 23 | \r |
| 24 | [5]Measuring (and fixing) I/O-controller throughput loss : the\r |
| 25 | kernel's I/O controllers can provide useful bandwidth\r |
| 26 | guarantees, but at a significant cost in throughput.\r |
| 27 | \r |
| 28 | [6]KDE's onboarding initiative, one year later : what has gone\r |
| 29 | right in KDE's effort to make it easier for contributors to\r |
| 30 | join the project, and what remains to be done.\r |
| 31 | \r |
| 32 | [7]Sharing and archiving data sets with Dat : an innovative\r |
| 33 | approach to addressing and sharing data on the net.\r |
| 34 | \r |
| 35 | This week's edition also includes these inner pages:\r |
| 36 | \r |
| 37 | [8]Brief items : Brief news items from throughout the\r |
| 38 | community.\r |
| 39 | \r |
| 40 | [9]Announcements : Newsletters, conferences, security updates,\r |
| 41 | patches, and more.\r |
| 42 | \r |
| 43 | Please enjoy this week's edition, and, as always, thank you\r |
| 44 | for supporting LWN.net.\r |
| 45 | \r |
| 46 | [10]Comments (none posted)\r |
| 47 | \r |
| 48 | [11]An introduction to the Julia language, part 1\r |
| 49 | \r |
| 50 | August 28, 2018\r |
| 51 | \r |
| 52 | This article was contributed by Lee Phillips\r |
| 53 | \r |
| 54 | [12]Julia is a young computer language aimed at serving the\r |
| 55 | needs of scientists, engineers, and other practitioners of\r |
| 56 | numerically intensive programming. It was first publicly\r |
| 57 | released in 2012. After an intense period of language\r |
| 58 | development, version 1.0 was [13]released on August 8. The 1.0\r |
| 59 | release promises years of language stability; users can be\r |
| 60 | confident that developments in the 1.x series will not break\r |
| 61 | their code. This is the first part of a two-part article\r |
| 62 | introducing the world of Julia. This part will introduce\r |
| 63 | enough of the language syntax and constructs to allow you to\r |
| 64 | begin to write simple programs. The following installment will\r |
| 65 | acquaint you with the additional pieces needed to create real\r |
| 66 | projects, and to make use of Julia's ecosystem.\r |
| 67 | \r |
| 68 | Goals and history\r |
| 69 | \r |
| 70 | The Julia project has ambitious goals. It wants the language\r |
| 71 | to perform about as well as Fortran or C when running\r |
| 72 | numerical algorithms, while remaining as pleasant to program\r |
| 73 | in as Python. I believe the project has met these goals and is\r |
| 74 | poised to see increasing adoption by numerical researchers,\r |
| 75 | especially now that an official, stable release is available.\r |
| 76 | \r |
| 77 | The Julia project maintains a [14]micro-benchmark page that\r |
| 78 | compares its numerical performance against both statically\r |
| 79 | compiled languages (C, Fortran) and dynamically typed\r |
| 80 | languages (R, Python). While it's certainly possible to argue\r |
| 81 | about the relevance and fairness of particular benchmarks, the\r |
| 82 | data overall supports the Julia team's contention that Julia\r |
| 83 | has generally achieved parity with Fortran and C; the\r |
| 84 | benchmark source code is available.\r |
| 85 | \r |
| 86 | Julia began as research in computer science at MIT; its\r |
| 87 | creators are Alan Edelman, Stefan Karpinski, Jeff Bezanson,\r |
| 88 | and Viral Shah. These four remain active developers of the\r |
| 89 | language. They, along with Keno Fischer, co-founder and CTO of\r |
| 90 | [15]Julia Computing , were kind enough to share their thoughts\r |
| 91 | with us about the language. I'll be drawing on their comments\r |
| 92 | later on; for now, let's get a taste of what Julia code looks\r |
| 93 | like.\r |
| 94 | \r |
| 95 | Getting started\r |
| 96 | \r |
| 97 | To explore Julia initially, start up its standard\r |
| 98 | [16]read-eval-print loop (REPL) by typing julia at the\r |
| 99 | terminal, assuming that you have installed it. You will then\r |
| 100 | be able to interact with what will seem to be an interpreted\r |
| 101 | language — but, behind the scenes, those commands are being\r |
| 102 | compiled by a just-in-time (JIT) compiler that uses the\r |
| 103 | [17]LLVM compiler framework . This allows Julia to be\r |
| 104 | interactive, while turning the code into fast, native machine\r |
| 105 | instructions. However, the JIT compiler passes sometimes\r |
| 106 | introduce noticeable delays at the REPL, especially when using\r |
| 107 | a function for the first time.\r |
| 108 | \r |
| 109 | To run a Julia program non-interactively, execute a command\r |
| 110 | like: $ julia script.jl\r |
| 111 | \r |
| 112 | Julia has all the usual data structures: numbers of various\r |
| 113 | types (including complex and rational numbers),\r |
| 114 | multidimensional arrays, dictionaries, strings, and\r |
| 115 | characters. Functions are first-class: they can be passed as\r |
| 116 | arguments to other functions, can be members of arrays, and so\r |
| 117 | on.\r |
| 118 | \r |
| 119 | Julia embraces Unicode. Strings, which are enclosed in double\r |
| 120 | quotes, are arrays of Unicode characters, which are enclosed\r |
| 121 | in single quotes. The " * " operator is used for string and\r |
| 122 | character concatenation. Thus 'a' and 'β' are characters, and\r |
| 123 | 'aβ' is a syntax error. "a" and "β" are strings, as are "aβ",\r |
| 124 | 'a' * 'β', and "a" * "β" — all evaluate to the same string.\r |
| 125 | \r |
| 126 | Variable and function names can contain non-ASCII characters.\r |
| 127 | This, along with Julia's clever syntax that understands\r |
| 128 | numbers prepended to variables to mean multiplication, goes a\r |
| 129 | long way to allowing the numerical scientist to write code\r |
| 130 | that more closely resembles the compact mathematical notation\r |
| 131 | of the equations that usually lie behind it. julia ε₁ = 0.01\r |
| 132 | \r |
| 133 | 0.01\r |
| 134 | \r |
| 135 | julia ε₂ = 0.02\r |
| 136 | \r |
| 137 | 0.02\r |
| 138 | \r |
| 139 | julia 2ε₁ + 3ε₂\r |
| 140 | \r |
| 141 | 0.08\r |
| 142 | \r |
| 143 | And where does Julia come down on the age-old debate of what\r |
| 144 | do about 1/2 ? In Fortran and Python 2, this will get you 0,\r |
| 145 | since 1 and 2 are integers, and the result is rounded down to\r |
| 146 | the integer 0. This was deemed inconsistent, and confusing to\r |
| 147 | some, so it was changed in Python 3 to return 0.5 — which is\r |
| 148 | what you get in Julia, too.\r |
| 149 | \r |
| 150 | While we're on the subject of fractions, Julia can handle\r |
| 151 | rational numbers, with a special syntax: 3//5 + 2//3 returns\r |
| 152 | 19//15 , while 3/5 + 2/3 gets you the floating-point answer\r |
| 153 | 1.2666666666666666. Internally, Julia thinks of a rational\r |
| 154 | number in its reduced form, so the expression 6//8 == 3//4\r |
| 155 | returns true , and numerator(6//8) returns 3 .\r |
| 156 | \r |
| 157 | Arrays\r |
| 158 | \r |
| 159 | Arrays are enclosed in square brackets and indexed with an\r |
| 160 | iterator that can contain a step value: julia a = [1, 2, 3,\r |
| 161 | 4, 5, 6]\r |
| 162 | \r |
| 163 | 6-element Array{Int64,1}:\r |
| 164 | \r |
| 165 | 1\r |
| 166 | \r |
| 167 | 2\r |
| 168 | \r |
| 169 | 3\r |
| 170 | \r |
| 171 | 4\r |
| 172 | \r |
| 173 | 5\r |
| 174 | \r |
| 175 | 6\r |
| 176 | \r |
| 177 | julia a[1:2:end]\r |
| 178 | \r |
| 179 | 3-element Array{Int64,1}:\r |
| 180 | \r |
| 181 | 1\r |
| 182 | \r |
| 183 | 3\r |
| 184 | \r |
| 185 | 5\r |
| 186 | \r |
| 187 | As you can see, indexing starts at one, and the useful end\r |
| 188 | index means the obvious thing. When you define a variable in\r |
| 189 | the REPL, Julia replies with the type and value of the\r |
| 190 | assigned data; you can suppress this output by ending your\r |
| 191 | input line with a semicolon.\r |
| 192 | \r |
| 193 | Since arrays are such a vital part of numerical computation,\r |
| 194 | and Julia makes them easy to work with, we'll spend a bit more\r |
| 195 | time with them than the other data structures.\r |
| 196 | \r |
| 197 | To illustrate the syntax, we can start with a couple of 2D\r |
| 198 | arrays, defined at the REPL: julia a = [1 2 3; 4 5 6]\r |
| 199 | \r |
| 200 | 2×3 Array{Int64,2}:\r |
| 201 | \r |
| 202 | 1 2 3\r |
| 203 | \r |
| 204 | 4 5 6\r |
| 205 | \r |
| 206 | julia z = [-1 -2 -3; -4 -5 -6];\r |
| 207 | \r |
| 208 | Indexing is as expected: julia a[1, 2]\r |
| 209 | \r |
| 210 | 2\r |
| 211 | \r |
| 212 | You can glue arrays together horizontally: julia [a z]\r |
| 213 | \r |
| 214 | 2×6 Array{Int64,2}:\r |
| 215 | \r |
| 216 | 1 2 3 -1 -2 -3\r |
| 217 | \r |
| 218 | 4 5 6 -4 -5 -6\r |
| 219 | \r |
| 220 | And vertically: julia [a; z]\r |
| 221 | \r |
| 222 | 4×3 Array{Int64,2}:\r |
| 223 | \r |
| 224 | 1 2 3\r |
| 225 | \r |
| 226 | 4 5 6\r |
| 227 | \r |
| 228 | -1 -2 -3\r |
| 229 | \r |
| 230 | -4 -5 -6\r |
| 231 | \r |
| 232 | Julia has all the usual operators for handling arrays, and\r |
| 233 | [18]linear algebra functions that work with matrices (2D\r |
| 234 | arrays). The linear algebra functions are part of Julia's\r |
| 235 | standard library, but need to be imported with a command like\r |
| 236 | " using LinearAlgebra ", which is a detail omitted from the\r |
| 237 | current documentation. The functions include such things as\r |
| 238 | determinants, matrix inverses, eigenvalues and eigenvectors,\r |
| 239 | many kinds of matrix factorizations, etc. Julia has not\r |
| 240 | reinvented the wheel here, but wisely uses the [19]LAPACK\r |
| 241 | Fortran library of battle-tested linear algebra routines.\r |
| 242 | \r |
| 243 | The extension of arithmetic operators to arrays is usually\r |
| 244 | intuitive: julia a + z\r |
| 245 | \r |
| 246 | 2×3 Array{Int64,2}:\r |
| 247 | \r |
| 248 | 0 0 0\r |
| 249 | \r |
| 250 | 0 0 0\r |
| 251 | \r |
| 252 | And the numerical prepending syntax works with arrays, too:\r |
| 253 | julia 3a + 4z\r |
| 254 | \r |
| 255 | 2×3 Array{Int64,2}:\r |
| 256 | \r |
| 257 | -1 -2 -3\r |
| 258 | \r |
| 259 | -4 -5 -6\r |
| 260 | \r |
| 261 | Putting a multiplication operator between two matrices gets\r |
| 262 | you matrix multiplication: julia a * transpose(a)\r |
| 263 | \r |
| 264 | 2×2 Array{Int64,2}:\r |
| 265 | \r |
| 266 | 14 32\r |
| 267 | \r |
| 268 | 32 77\r |
| 269 | \r |
| 270 | You can "broadcast" numbers to cover all the elements in an\r |
| 271 | array by prepending the usual arithmetic operators with a dot:\r |
| 272 | julia 1 .+ a\r |
| 273 | \r |
| 274 | 2×3 Array{Int64,2}:\r |
| 275 | \r |
| 276 | 2 3 4\r |
| 277 | \r |
| 278 | 5 6 7\r |
| 279 | \r |
| 280 | Note that the language only actually requires the dot for some\r |
| 281 | operators, but not for others, such as "*" and "/". The\r |
| 282 | reasons for this are arcane, and it probably makes sense to be\r |
| 283 | consistent and use the dot whenever you intend broadcasting.\r |
| 284 | Note also that the current version of the official\r |
| 285 | documentation is incorrect in claiming that you may omit the\r |
| 286 | dot from "+" and "-"; in fact, this now gives an error.\r |
| 287 | \r |
| 288 | You can use the dot notation to turn any function into one\r |
| 289 | that operates on each element of an array: julia\r |
| 290 | round.(sin.([0, π/2, π, 3π/2, 2π]))\r |
| 291 | \r |
| 292 | 5-element Array{Float64,1}:\r |
| 293 | \r |
| 294 | 0.0\r |
| 295 | \r |
| 296 | 1.0\r |
| 297 | \r |
| 298 | 0.0\r |
| 299 | \r |
| 300 | -1.0\r |
| 301 | \r |
| 302 | -0.0\r |
| 303 | \r |
| 304 | The example above illustrates chaining two dotted functions\r |
| 305 | together. The Julia compiler turns expressions like this into\r |
| 306 | "fused" operations: instead of applying each function in turn\r |
| 307 | to create a new array that is passed to the next function, the\r |
| 308 | compiler combines the functions into a single compound\r |
| 309 | function that is applied once over the array, creating a\r |
| 310 | significant optimization.\r |
| 311 | \r |
| 312 | You can use this dot notation with any function, including\r |
| 313 | your own, to turn it into a version that operates element-wise\r |
| 314 | over arrays.\r |
| 315 | \r |
| 316 | Dictionaries (associative arrays) can be defined with several\r |
| 317 | syntaxes. Here's one: julia d1 = Dict("A"=1, "B"=2)\r |
| 318 | \r |
| 319 | Dict{String,Int64} with 2 entries:\r |
| 320 | \r |
| 321 | "B" = 2\r |
| 322 | \r |
| 323 | "A" = 1\r |
| 324 | \r |
| 325 | You may have noticed that the code snippets so far have not\r |
| 326 | included any type declarations. Every value in Julia has a\r |
| 327 | type, but the compiler will infer types if they are not\r |
| 328 | specified. It is generally not necessary to declare types for\r |
| 329 | performance, but type declarations sometimes serve other\r |
| 330 | purposes, that we'll return to later. Julia has a deep and\r |
| 331 | sophisticated type system, including user-defined types and\r |
| 332 | C-like structs. Types can have behaviors associated with them,\r |
| 333 | and can inherit behaviors from other types. The best thing\r |
| 334 | about Julia's type system is that you can ignore it entirely,\r |
| 335 | use just a few pieces of it, or spend weeks studying its\r |
| 336 | design.\r |
| 337 | \r |
| 338 | Control flow\r |
| 339 | \r |
| 340 | Julia code is organized in blocks, which can indicate control\r |
| 341 | flow, function definitions, and other code units. Blocks are\r |
| 342 | terminated with the end keyword, and indentation is not\r |
| 343 | significant. Statements are separated either with newlines or\r |
| 344 | semicolons.\r |
| 345 | \r |
| 346 | Julia has the typical control flow constructs; here is a while\r |
| 347 | block: julia i = 1;\r |
| 348 | \r |
| 349 | julia while i 5\r |
| 350 | \r |
| 351 | print(i)\r |
| 352 | \r |
| 353 | global i = i + 1\r |
| 354 | \r |
| 355 | end\r |
| 356 | \r |
| 357 | 1234\r |
| 358 | \r |
| 359 | Notice the global keyword. Most blocks in Julia introduce a\r |
| 360 | local scope for variables; without this keyword here, we would\r |
| 361 | get an error about an undefined variable.\r |
| 362 | \r |
| 363 | Julia has the usual if statements and for loops that use the\r |
| 364 | same iterators that we introduced above for array indexing. We\r |
| 365 | can also iterate over collections: julia for i ∈ ['a', 'b',\r |
| 366 | 'c']\r |
| 367 | \r |
| 368 | println(i)\r |
| 369 | \r |
| 370 | end\r |
| 371 | \r |
| 372 | a\r |
| 373 | \r |
| 374 | b\r |
| 375 | \r |
| 376 | c\r |
| 377 | \r |
| 378 | In place of the fancy math symbol in this for loop, we can use\r |
| 379 | " = " or " in ". If you want to use the math symbol but have\r |
| 380 | no convenient way to type it, the REPL will help you: type "\r |
| 381 | \in " and the TAB key, and the symbol appears; you can type\r |
| 382 | many [20]LaTeX expressions into the REPL in this way.\r |
| 383 | \r |
| 384 | Development of Julia\r |
| 385 | \r |
| 386 | The language is developed on GitHub, with over 700\r |
| 387 | contributors. The Julia team mentioned in their email to us\r |
| 388 | that the decision to use GitHub has been particularly good for\r |
| 389 | Julia, as it streamlined the process for many of their\r |
| 390 | contributors, who are scientists or domain experts in various\r |
| 391 | fields, rather than professional software developers.\r |
| 392 | \r |
| 393 | The creators of Julia have [21]published [PDF] a detailed\r |
| 394 | “mission statement” for the language, describing their aims\r |
| 395 | and motivations. A key issue that they wanted their language\r |
| 396 | to solve is what they called the "two-language problem." This\r |
| 397 | situation is familiar to anyone who has used Python or another\r |
| 398 | dynamic language on a demanding numerical problem. To get good\r |
| 399 | performance, you will wind up rewriting the numerically\r |
| 400 | intensive parts of the program in C or Fortran, dealing with\r |
| 401 | the interface between the two languages, and may still be\r |
| 402 | disappointed in the overhead presented by calling the foreign\r |
| 403 | routines from your original code.\r |
| 404 | \r |
| 405 | For Python, [22]NumPy and SciPy wrap many numerical routines,\r |
| 406 | written in Fortran or C, for efficient use from that language,\r |
| 407 | but you can only take advantage of this if your calculation\r |
| 408 | fits the pattern of an available routine; in more general\r |
| 409 | cases, where you will have to write a loop over your data, you\r |
| 410 | are stuck with Python's native performance, which is orders of\r |
| 411 | magnitude slower. If you switch to an alternative, faster\r |
| 412 | implementation of Python, such as [23]PyPy , the numerical\r |
| 413 | libraries may not be compatible; NumPy became available for\r |
| 414 | PyPy only within about the past year.\r |
| 415 | \r |
| 416 | Julia solves the two-language problem by being as expressive\r |
| 417 | and simple to program in as a dynamic scripting language,\r |
| 418 | while having the native performance of a static, compiled\r |
| 419 | language. There is no need to write numerical libraries in a\r |
| 420 | second language, but C or Fortran library routines can be\r |
| 421 | called using a facility that Julia has built-in. Other\r |
| 422 | languages, such as [24]Python or [25]R , can also interoperate\r |
| 423 | easily with Julia using external packages.\r |
| 424 | \r |
| 425 | Documentation\r |
| 426 | \r |
| 427 | There are many resources to turn to to learn the language.\r |
| 428 | There is an extensive and detailed [26]manual at Julia\r |
| 429 | headquarters, and this may be a good place to start. However,\r |
| 430 | although the first few chapters provide a gentle introduction,\r |
| 431 | the material soon becomes dense and, at times, hard to follow,\r |
| 432 | with references to concepts that are not explained until later\r |
| 433 | chapters. Fortunately, there is a [27]"learning" link at the\r |
| 434 | top of the Julia home page, which takes you to a long list of\r |
| 435 | videos, tutorials, books, articles, and classes both about\r |
| 436 | Julia and that use Julia in teaching subjects such a numerical\r |
| 437 | analysis. There is also a fairly good [28]cheat-sheet [PDF] ,\r |
| 438 | which was just updated for v. 1.0.\r |
| 439 | \r |
| 440 | If you're coming from Python, [29]this list of noteworthy\r |
| 441 | differences between Python and Julia syntax will probably be\r |
| 442 | useful.\r |
| 443 | \r |
| 444 | Some of the linked tutorials are in the form of [30]Jupyter\r |
| 445 | notebooks — indeed, the name "Jupyter" is formed from "Julia",\r |
| 446 | "Python", and "R", which are the three original languages\r |
| 447 | supported by the interface. The [31]Julia kernel for Jupyter\r |
| 448 | was recently upgraded to support v. 1.0. Judicious sampling of\r |
| 449 | a variety of documentation sources, combined with liberal\r |
| 450 | experimentation, may be the best way of learning the language.\r |
| 451 | Jupyter makes this experimentation more inviting for those who\r |
| 452 | enjoy the web-based interface, but the REPL that comes with\r |
| 453 | Julia helps a great deal in this regard by providing, for\r |
| 454 | instance, TAB completion and an extensive help system invoked\r |
| 455 | by simply pressing the "?" key.\r |
| 456 | \r |
| 457 | Stay tuned\r |
| 458 | \r |
| 459 | The [32]next installment in this two-part series will explain\r |
| 460 | how Julia is organized around the concept of "multiple\r |
| 461 | dispatch". You will learn how to create functions and make\r |
| 462 | elementary use of Julia's type system. We'll see how to\r |
| 463 | install packages and use modules, and how to make graphs.\r |
| 464 | Finally, Part 2 will briefly survey the important topics of\r |
| 465 | macros and distributed computing.\r |
| 466 | \r |
| 467 | [33]Comments (80 posted)\r |
| 468 | \r |
| 469 | [34]C considered dangerous\r |
| 470 | \r |
| 471 | By Jake Edge\r |
| 472 | \r |
| 473 | August 29, 2018\r |
| 474 | \r |
| 475 | [35]LSS NA\r |
| 476 | \r |
| 477 | At the North America edition of the [36]2018 Linux Security\r |
| 478 | Summit (LSS NA), which was held in late August in Vancouver,\r |
| 479 | Canada, Kees Cook gave a presentation on some of the dangers\r |
| 480 | that come with programs written in C. In particular, of\r |
| 481 | course, the Linux kernel is mostly written in C, which means\r |
| 482 | that the security of our systems rests on a somewhat dangerous\r |
| 483 | foundation. But there are things that can be done to help firm\r |
| 484 | things up by " Making C Less Dangerous " as the title of his\r |
| 485 | talk suggested.\r |
| 486 | \r |
| 487 | He began with a brief summary of the work that he and others\r |
| 488 | are doing as part of the [37]Kernel Self Protection Project\r |
| 489 | (KSPP). The goal of the project is to get kernel protections\r |
| 490 | merged into the mainline. These protections are not targeted\r |
| 491 | at protecting user-space processes from other (possibly rogue)\r |
| 492 | processes, but are, instead, focused on protecting the kernel\r |
| 493 | from user-space code. There are around 12 organizations and\r |
| 494 | ten individuals working on roughly 20 different technologies\r |
| 495 | as part of the KSPP, he said. The progress has been "slow and\r |
| 496 | steady", he said, which is how he thinks it should go. [38]\r |
| 497 | \r |
| 498 | One of the main problems is that C is treated mostly like a\r |
| 499 | fancy assembler. The kernel developers do this because they\r |
| 500 | want the kernel to be as fast and as small as possible. There\r |
| 501 | are other reasons, too, such as the need to do\r |
| 502 | architecture-specific tasks that lack a C API (e.g. setting up\r |
| 503 | page tables, switching to 64-bit mode).\r |
| 504 | \r |
| 505 | But there is lots of undefined behavior in C. This\r |
| 506 | "operational baggage" can lead to various problems. In\r |
| 507 | addition, C has a weak standard library with multiple utility\r |
| 508 | functions that have various pitfalls. In C, the content of\r |
| 509 | uninitialized automatic variables is undefined, but in the\r |
| 510 | machine code that it gets translated to, the value is whatever\r |
| 511 | happened to be in that memory location before. In C, a\r |
| 512 | function pointer can be called even if the type of the pointer\r |
| 513 | does not match the type of the function being called—assembly\r |
| 514 | doesn't care, it just jumps to a location, he said.\r |
| 515 | \r |
| 516 | The APIs in the standard library are also bad in many cases.\r |
| 517 | He asked: why is there no argument to memcpy() to specify the\r |
| 518 | maximum destination length? He noted a recent [39]blog post\r |
| 519 | from Raph Levien entitled "With Undefined Behavior, Anything\r |
| 520 | is Possible". That obviously resonated with Cook, as he\r |
| 521 | pointed out his T-shirt—with the title and artwork from the\r |
| 522 | post.\r |
| 523 | \r |
| 524 | Less danger\r |
| 525 | \r |
| 526 | He then moved on to some things that kernel developers can do\r |
| 527 | (and are doing) to get away from some of the dangers of C. He\r |
| 528 | began with variable-length arrays (VLAs), which can be used to\r |
| 529 | overflow the stack to access data outside of its region. Even\r |
| 530 | if the stack has a guard page, VLAs can be used to jump past\r |
| 531 | it to write into other memory, which can then be used by some\r |
| 532 | other kind of attack. The C language is "perfectly fine with\r |
| 533 | this". It is easy to find uses of VLAs with the -Wvla flag,\r |
| 534 | however.\r |
| 535 | \r |
| 536 | But it turns out that VLAs are [40]not just bad from a\r |
| 537 | security perspective , they are also slow. In a\r |
| 538 | micro-benchmark associated with a [41]patch removing a VLA , a\r |
| 539 | 13% performance boost came from using a fixed-size array. He\r |
| 540 | dug in a bit further and found that much more code is being\r |
| 541 | generated to handle a VLA, which explains the speed increase.\r |
| 542 | Since Linus Torvalds has [42]declared that VLAs should be\r |
| 543 | removed from the kernel because they cause security problems\r |
| 544 | and also slow the kernel down; Cook said "don't use VLAs".\r |
| 545 | \r |
| 546 | Another problem area is switch statements, in particular where\r |
| 547 | there is no break for a case . That could mean that the\r |
| 548 | programmer expects and wants to fall through to the next case\r |
| 549 | or it could be that the break was simply forgotten. There is a\r |
| 550 | way to get a warning from the compiler for fall-throughs, but\r |
| 551 | there needs to be a way to mark those that are truly meant to\r |
| 552 | be that way. A special fall-through "statement" in the form of\r |
| 553 | a comment is what has been agreed on within the\r |
| 554 | static-analysis community. He and others have been going\r |
| 555 | through each of the places where there is no break to add\r |
| 556 | these comments (or a break ); they have "found a lot of bugs\r |
| 557 | this way", he said.\r |
| 558 | \r |
| 559 | Uninitialized local variables will generate a warning, but not\r |
| 560 | if the variable is passed in by reference. There are some GCC\r |
| 561 | plugins that will automatically initialize these variables,\r |
| 562 | but there are also patches for both GCC and Clang to provide a\r |
| 563 | compiler option to do so. Neither of those is upstream yet,\r |
| 564 | but Torvalds has praised the effort so the kernel would likely\r |
| 565 | use the option. An interesting side effect that came about\r |
| 566 | while investigating this was a warning he got about\r |
| 567 | unreachable code when he enabled the auto-initialization.\r |
| 568 | There were two variables declared just after a switch (and\r |
| 569 | outside of any case ), where they would never be reached.\r |
| 570 | \r |
| 571 | Arithmetic overflow is another undefined behavior in C that\r |
| 572 | can cause various problems. GCC can check for signed overflow,\r |
| 573 | which performs well (the overhead is in the noise, he said),\r |
| 574 | but adding warning messages for it does grow the kernel by 6%;\r |
| 575 | making the overflow abort, instead, only adds 0.1%. Clang can\r |
| 576 | check for both signed and unsigned overflow; signed overflow\r |
| 577 | is undefined, while unsigned overflow is defined, but often\r |
| 578 | unexpected. Marking places where unsigned overflow is expected\r |
| 579 | is needed; it would be nice to get those annotations put into\r |
| 580 | the kernel, Cook said.\r |
| 581 | \r |
| 582 | Explicit bounds checking is expensive. Doing it for\r |
| 583 | copy_{to,from}_user() is a less than 1% performance hit, but\r |
| 584 | adding it to the strcpy() and memcpy() families are around a\r |
| 585 | 2% hit. Pre-Meltdown that would have been a totally impossible\r |
| 586 | performance regression for security, he said; post-Meltdown,\r |
| 587 | since it is less than 5%, maybe there is a chance to add this\r |
| 588 | checking.\r |
| 589 | \r |
| 590 | Better APIs would help as well. He pointed to the evolution of\r |
| 591 | strcpy() , through str n cpy() and str l cpy() (each with\r |
| 592 | their own bounds flaws) to str s cpy() , which seems to be "OK\r |
| 593 | so far". He also mentioned memcpy() again as a poor API with\r |
| 594 | respect to bounds checking.\r |
| 595 | \r |
| 596 | Hardware support for bounds checking is available in the\r |
| 597 | application data integrity (ADI) feature for SPARC and is\r |
| 598 | coming for Arm; it may also be available for Intel processors\r |
| 599 | at some point. These all use a form of "memory tagging", where\r |
| 600 | allocations get a tag that is stored in the high-order byte of\r |
| 601 | the address. An offset from the address can be checked by the\r |
| 602 | hardware to see if it still falls within the allocated region\r |
| 603 | based on the tag.\r |
| 604 | \r |
| 605 | Control-flow integrity (CFI) has become more of an issue\r |
| 606 | lately because much of what attackers had used in the past has\r |
| 607 | been marked as "no execute" so they are turning to using\r |
| 608 | existing code "gadgets" already present in the kernel by\r |
| 609 | hijacking existing indirect function calls. In C, you can just\r |
| 610 | call pointers without regard to the type as it just treats\r |
| 611 | them as an address to jump to. Clang has a CFI-sanitize\r |
| 612 | feature that enforces the function prototype to restrict the\r |
| 613 | calls that can be made. It is done at runtime and is not\r |
| 614 | perfect, in part because there are lots of functions in the\r |
| 615 | kernel that take one unsigned long parameter and return an\r |
| 616 | unsigned long.\r |
| 617 | \r |
| 618 | Attacks on CFI have both a "forward edge", which is what CFI\r |
| 619 | sanitize tries to handle, and a "backward edge" that comes\r |
| 620 | from manipulating the stack values, the return address in\r |
| 621 | particular. Clang has two methods available to prevent the\r |
| 622 | stack manipulation. The first is the "safe stack", which puts\r |
| 623 | various important items (e.g. "safe" variables, register\r |
| 624 | spills, and the return address) on a separate stack.\r |
| 625 | Alternatively, the "shadow stack" feature creates a separate\r |
| 626 | stack just for return addresses.\r |
| 627 | \r |
| 628 | One problem with these other stacks is that they are still\r |
| 629 | writable, so if an attacker can find them in memory, they can\r |
| 630 | still perform their attacks. Hardware-based protections, like\r |
| 631 | Intel's Control-Flow Enforcement Technology (CET),\r |
| 632 | [43]provides a read-only shadow call stack for return\r |
| 633 | addresses. Another hardware protection is [44]pointer\r |
| 634 | authentication for Arm, which adds a kind of encrypted tag to\r |
| 635 | the return address that can be verified before it is used.\r |
| 636 | \r |
| 637 | Status and challenges\r |
| 638 | \r |
| 639 | Cook then went through the current status of handling these\r |
| 640 | different problems in the kernel. VLAs are almost completely\r |
| 641 | gone, he said, just a few remain in the crypto subsystem; he\r |
| 642 | hopes those VLAs will be gone by 4.20 (or whatever the number\r |
| 643 | of the next kernel release turns out to be). Once that\r |
| 644 | happens, he plans to turn on -Wvla for the kernel build so\r |
| 645 | that none creep back in.\r |
| 646 | \r |
| 647 | There has been steady progress made on marking fall-through\r |
| 648 | cases in switch statements. Only 745 remain to be handled of\r |
| 649 | the 2311 that existed when this work started; each one\r |
| 650 | requires scrutiny to determine what the author's intent is.\r |
| 651 | Auto-initialized local variables can be done using compiler\r |
| 652 | plugins, but that is "not quite what we want", he said. More\r |
| 653 | compiler support would be helpful there. For arithmetic\r |
| 654 | overflow, it would be nice to see GCC get support for the\r |
| 655 | unsigned case, but memory allocations are now doing explicit\r |
| 656 | overflow checking at this point.\r |
| 657 | \r |
| 658 | Bounds checking has seen some "crying about performance hits",\r |
| 659 | so we are waiting impatiently for hardware support, he said.\r |
| 660 | CFI forward-edge protection needs [45]link-time optimization\r |
| 661 | (LTO) support for Clang in the kernel, but it is currently\r |
| 662 | working on Android. For backward-edge mitigation, the Clang\r |
| 663 | shadow call stack is working on Android, but we are\r |
| 664 | impatiently waiting for hardware support for that too.\r |
| 665 | \r |
| 666 | There are a number of challenges in doing security development\r |
| 667 | for the kernel, Cook said. There are cultural boundaries due\r |
| 668 | to conservatism within the kernel community; that requires\r |
| 669 | patiently working and reworking features in order to get them\r |
| 670 | upstream. There are, of course, technical challenges because\r |
| 671 | of the complexity of security changes; those kinds of problems\r |
| 672 | can be solved. There are also resource limitations in terms of\r |
| 673 | developers, testers, reviewers, and so on. KSPP and the other\r |
| 674 | kernel security developers are still making that "slow but\r |
| 675 | steady" progress.\r |
| 676 | \r |
| 677 | Cook's [46]slides [PDF] are available for interested readers;\r |
| 678 | before long, there should be a video available of the talk as\r |
| 679 | well.\r |
| 680 | \r |
| 681 | [I would like to thank LWN's travel sponsor, the Linux\r |
| 682 | Foundation, for travel assistance to attend the Linux Security\r |
| 683 | Summit in Vancouver.]\r |
| 684 | \r |
| 685 | [47]Comments (70 posted)\r |
| 686 | \r |
| 687 | [48]The second half of the 4.19 merge window\r |
| 688 | \r |
| 689 | By Jonathan Corbet\r |
| 690 | \r |
| 691 | August 26, 2018 By the time Linus Torvalds [49]released\r |
| 692 | 4.19-rc1 and closed the merge window for this development\r |
| 693 | cycle, 12,317 non-merge changesets had found their way into\r |
| 694 | the mainline; about 4,800 of those landed after [50]last\r |
| 695 | week's summary was written. As tends to be the case late in\r |
| 696 | the merge window, many of those changes were fixes for the\r |
| 697 | bigger patches that went in early, but there were also a\r |
| 698 | number of new features added. Some of the more significant\r |
| 699 | changes include:\r |
| 700 | \r |
| 701 | Core kernel\r |
| 702 | \r |
| 703 | The full set of patches adding [51]control-group awareness to\r |
| 704 | the out-of-memory killer has not been merged due to ongoing\r |
| 705 | disagreements, but one piece of it has: there is a new\r |
| 706 | memory.oom.group control knob that will cause all processes\r |
| 707 | within a control group to be killed in an out-of-memory\r |
| 708 | situation.\r |
| 709 | \r |
| 710 | A new set of protections has been added to prevent an attacker\r |
| 711 | from fooling a program into writing to an existing file or\r |
| 712 | FIFO. An open with the O_CREAT flag to a file or FIFO in a\r |
| 713 | world-writable, sticky directory (e.g. /tmp ) will fail if the\r |
| 714 | owner of the opening process is not the owner of either the\r |
| 715 | target file or the containing directory. This behavior,\r |
| 716 | disabled by default, is controlled by the new\r |
| 717 | protected_regular and protected_fifos sysctl knobs.\r |
| 718 | \r |
| 719 | Filesystems and block layer\r |
| 720 | \r |
| 721 | The dm-integrity device-mapper target can now use a separate\r |
| 722 | device for metadata storage.\r |
| 723 | \r |
| 724 | EROFS, the "enhanced read-only filesystem", has been added to\r |
| 725 | the staging tree. It is " a lightweight read-only file system\r |
| 726 | with modern designs (eg. page-sized blocks, inline\r |
| 727 | xattrs/data, etc.) for scenarios which need high-performance\r |
| 728 | read-only requirements, eg. firmwares in mobile phone or\r |
| 729 | LIVECDs "\r |
| 730 | \r |
| 731 | The new "metadata copy-up" feature in overlayfs will avoid\r |
| 732 | copying a file's contents to the upper layer on a\r |
| 733 | metadata-only change. See [52]this commit for details.\r |
| 734 | \r |
| 735 | Hardware support\r |
| 736 | \r |
| 737 | Graphics : Qualcomm Adreno A6xx GPUs.\r |
| 738 | \r |
| 739 | Industrial I/O : Spreadtrum SC27xx series PMIC\r |
| 740 | analog-to-digital converters, Analog Devices AD5758\r |
| 741 | digital-to-analog converters, Intersil ISL29501 time-of-flight\r |
| 742 | sensors, Silicon Labs SI1133 UV index/ambient light sensor\r |
| 743 | chips, and Bosch Sensortec BME680 sensors.\r |
| 744 | \r |
| 745 | Miscellaneous : Generic ADC-based resistive touchscreens,\r |
| 746 | Generic ASIC devices via the Google [53]Gasket framework ,\r |
| 747 | Analog Devices ADGS1408/ADGS1409 multiplexers, Actions Semi\r |
| 748 | Owl SoCs DMA controllers, MEN 16Z069 watchdog timers, Rohm\r |
| 749 | BU21029 touchscreen controllers, Cirrus Logic CS47L35,\r |
| 750 | CS47L85, CS47L90, and CS47L91 codecs, Cougar 500k gaming\r |
| 751 | keyboards, Qualcomm GENI-based I2C controllers, Actions\r |
| 752 | Semiconductor Owl I2C controllers, ChromeOS EC-based USBPD\r |
| 753 | chargers, and Analog Devices ADP5061 battery chargers.\r |
| 754 | \r |
| 755 | USB : Nuvoton NPCM7XX on-chip EHCI USB controllers, Broadcom\r |
| 756 | Stingray PCIe PHYs, and Renesas R-Car generation 3 PCIe PHYs.\r |
| 757 | \r |
| 758 | There is also a new subsystem for the abstraction of GNSS\r |
| 759 | (global navigation satellite systems — GPS, for example)\r |
| 760 | receivers in the kernel. To date, such devices have been\r |
| 761 | handled with an abundance of user-space drivers; the hope is\r |
| 762 | to bring some order in this area. Support for u-blox and\r |
| 763 | SiRFstar receivers has been added as well.\r |
| 764 | \r |
| 765 | Kernel internal\r |
| 766 | \r |
| 767 | The __deprecated marker, used to mark interfaces that should\r |
| 768 | no longer be used, has been deprecated and removed from the\r |
| 769 | kernel entirely. [54]Torvalds said : " They are not useful.\r |
| 770 | They annoy everybody, and nobody ever does anything about\r |
| 771 | them, because it's always 'somebody elses problem'. And when\r |
| 772 | people start thinking that warnings are normal, they stop\r |
| 773 | looking at them, and the real warnings that mean something go\r |
| 774 | unnoticed. "\r |
| 775 | \r |
| 776 | The minimum version of GCC required by the kernel has been\r |
| 777 | moved up to 4.6.\r |
| 778 | \r |
| 779 | There are a couple of significant changes that failed to get\r |
| 780 | in this time around, including the [55]XArray data structure.\r |
| 781 | The patches are thought to be ready, but they had the bad luck\r |
| 782 | to be based on a tree that failed to be merged for other\r |
| 783 | reasons, so Torvalds [56]didn't even look at them . That, in\r |
| 784 | turn, blocks another set of patches intended to enable\r |
| 785 | migration of slab-allocated objects.\r |
| 786 | \r |
| 787 | The other big deferral is the [57]new system-call API for\r |
| 788 | filesystem mounting . Despite ongoing [58]concerns about what\r |
| 789 | happens when the same low-level device is mounted multiple\r |
| 790 | times with conflicting options, Al Viro sent [59]a pull\r |
| 791 | request to send this work upstream. The ensuing discussion\r |
| 792 | made it clear that there is still not a consensus in this\r |
| 793 | area, though, so it seems that this work has to wait for\r |
| 794 | another cycle.\r |
| 795 | \r |
| 796 | Assuming all goes well, the kernel will stabilize over the\r |
| 797 | coming weeks and the final 4.19 release will happen in\r |
| 798 | mid-October.\r |
| 799 | \r |
| 800 | [60]Comments (1 posted)\r |
| 801 | \r |
| 802 | [61]Measuring (and fixing) I/O-controller throughput loss\r |
| 803 | \r |
| 804 | August 29, 2018\r |
| 805 | \r |
| 806 | This article was contributed by Paolo Valente\r |
| 807 | \r |
| 808 | Many services, from web hosting and video streaming to cloud\r |
| 809 | storage, need to move data to and from storage. They also\r |
| 810 | often require that each per-client I/O flow be guaranteed a\r |
| 811 | non-zero amount of bandwidth and a bounded latency. An\r |
| 812 | expensive way to provide these guarantees is to over-provision\r |
| 813 | storage resources, keeping each resource underutilized, and\r |
| 814 | thus have plenty of bandwidth available for the few I/O flows\r |
| 815 | dispatched to each medium. Alternatively one can use an I/O\r |
| 816 | controller. Linux provides two mechanisms designed to throttle\r |
| 817 | some I/O streams to allow others to meet their bandwidth and\r |
| 818 | latency requirements. These mechanisms work, but they come at\r |
| 819 | a cost: a loss of as much as 80% of total available I/O\r |
| 820 | bandwidth. I have run some tests to demonstrate this problem;\r |
| 821 | some upcoming improvements to the [62]bfq I/O scheduler\r |
| 822 | promise to improve the situation considerably.\r |
| 823 | \r |
| 824 | Throttling does guarantee control, even on drives that happen\r |
| 825 | to be highly utilized but, as will be seen, it has a hard time\r |
| 826 | actually ensuring that drives are highly utilized. Even with\r |
| 827 | greedy I/O flows, throttling easily ends up utilizing as\r |
| 828 | little as 20% of the available speed of a flash-based drive.\r |
| 829 | Such a speed loss may be particularly problematic with\r |
| 830 | lower-end storage. On the opposite end, it is also\r |
| 831 | disappointing with high-end hardware, as the Linux block I/O\r |
| 832 | stack itself has been [63]redesigned from the ground up to\r |
| 833 | fully utilize the high speed of modern, fast storage. In\r |
| 834 | addition, throttling fails to guarantee the expected\r |
| 835 | bandwidths if I/O contains both reads and writes, or is\r |
| 836 | sporadic in nature.\r |
| 837 | \r |
| 838 | On the bright side, there now seems to be an effective\r |
| 839 | alternative for controlling I/O: the proportional-share policy\r |
| 840 | provided by the bfq I/O scheduler. It enables nearly 100%\r |
| 841 | storage bandwidth utilization, at least with some of the\r |
| 842 | workloads that are problematic for throttling. An upcoming\r |
| 843 | version of bfq may be able to achieve this result with almost\r |
| 844 | all workloads. Finally, bfq guarantees bandwidths with all\r |
| 845 | workloads. The current limitation of bfq is that its execution\r |
| 846 | overhead becomes significant at speeds above 400,000 I/O\r |
| 847 | operations per second on commodity CPUs.\r |
| 848 | \r |
| 849 | Using the bfq I/O scheduler, Linux can now guarantee low\r |
| 850 | latency to lightweight flows containing sporadic, short I/O.\r |
| 851 | No throughput issues arise, and no configuration is required.\r |
| 852 | This capability benefits important, time-sensitive tasks, such\r |
| 853 | as video or audio streaming, as well as executing commands or\r |
| 854 | starting applications. Although benchmarks are not available\r |
| 855 | yet, these guarantees might also be provided by the newly\r |
| 856 | proposed [64]I/O latency controller . It allows administrators\r |
| 857 | to set target latencies for I/O requests originating from each\r |
| 858 | group of processes, and favors the groups with the lowest\r |
| 859 | target latency.\r |
| 860 | \r |
| 861 | The testbed\r |
| 862 | \r |
| 863 | I ran the tests with an ext4 filesystem mounted on a PLEXTOR\r |
| 864 | PX-256M5S SSD, which features a peak rate of ~160MB/s with\r |
| 865 | random I/O, and of ~500MB/s with sequential I/O. I used\r |
| 866 | blk-mq, in Linux 4.18. The system was equipped with a 2.4GHz\r |
| 867 | Intel Core i7-2760QM CPU and 1.3GHz DDR3 DRAM. In such a\r |
| 868 | system, a single thread doing synchronous reads reaches a\r |
| 869 | throughput of 23MB/s.\r |
| 870 | \r |
| 871 | For the purposes of these tests, each process is considered to\r |
| 872 | be in one of two groups, termed "target" and "interferers". A\r |
| 873 | target is a single-process, I/O-bound group whose I/O is\r |
| 874 | focused on. In particular, I measure the I/O throughput\r |
| 875 | enjoyed by this group to get the minimum bandwidth delivered\r |
| 876 | to the group. An interferer is single-process group whose role\r |
| 877 | is to generate additional I/O that interferes with the I/O of\r |
| 878 | the target. The tested workloads contain one target and\r |
| 879 | multiple interferers.\r |
| 880 | \r |
| 881 | The single process in each group either reads or writes,\r |
| 882 | through asynchronous (buffered) operations, to one file —\r |
| 883 | different from the file read or written by any other process —\r |
| 884 | after invalidating the buffer cache for the file. I define a\r |
| 885 | reader or writer process as either "random" or "sequential",\r |
| 886 | depending on whether it reads or writes its file at random\r |
| 887 | positions or sequentially. Finally, an interferer is defined\r |
| 888 | as being either "active" or "inactive" depending on whether it\r |
| 889 | performs I/O during the test. When an interferer is mentioned,\r |
| 890 | it is assumed that the interferer is active.\r |
| 891 | \r |
| 892 | Workloads are defined so as to try to cover the combinations\r |
| 893 | that, I believe, most influence the performance of the storage\r |
| 894 | device and of the I/O policies. For brevity, in this article I\r |
| 895 | show results for only two groups of workloads:\r |
| 896 | \r |
| 897 | Static sequential : four synchronous sequential readers or\r |
| 898 | four asynchronous sequential writers, plus five inactive\r |
| 899 | interferers.\r |
| 900 | \r |
| 901 | Static random : four synchronous random readers, all with a\r |
| 902 | block size equal to 4k, plus five inactive interferers.\r |
| 903 | \r |
| 904 | To create each workload, I considered, for each mix of\r |
| 905 | interferers in the group, two possibilities for the target: it\r |
| 906 | could be either a random or a sequential synchronous reader.\r |
| 907 | In [65]a longer version of this article [PDF] , you will also\r |
| 908 | find results for workloads with varying degrees of I/O\r |
| 909 | randomness, and for dynamic workloads (containing sporadic I/O\r |
| 910 | sources). These extra results confirm the losses of throughput\r |
| 911 | and I/O control for throttling that are shown here.\r |
| 912 | \r |
| 913 | I/O policies\r |
| 914 | \r |
| 915 | Linux provides two I/O-control mechanisms for guaranteeing (a\r |
| 916 | minimum) bandwidth, or at least fairness, to long-lived flows:\r |
| 917 | the throttling and proportional-share I/O policies. With\r |
| 918 | throttling, one can set a maximum bandwidth limit — "max\r |
| 919 | limit" for brevity — for the I/O of each group. Max limits can\r |
| 920 | be used, in an indirect way, to provide the service guarantee\r |
| 921 | at the focus of this article. For example, to guarantee\r |
| 922 | minimum bandwidths to I/O flows, a group can be guaranteed a\r |
| 923 | minimum bandwidth by limiting the maximum bandwidth of all the\r |
| 924 | other groups.\r |
| 925 | \r |
| 926 | Unfortunately, max limits have two drawbacks in terms of\r |
| 927 | throughput. First, if some groups do not use their allocated\r |
| 928 | bandwidth, that bandwidth cannot be reclaimed by other active\r |
| 929 | groups. Second, limits must comply with the worst-case speed\r |
| 930 | of the device, namely, its random-I/O peak rate. Such limits\r |
| 931 | will clearly leave a lot of throughput unused with workloads\r |
| 932 | that otherwise would drive the device to higher throughput\r |
| 933 | levels. Maximizing throughput is simply not a goal of max\r |
| 934 | limits. So, for brevity, test results with max limits are not\r |
| 935 | shown here. You can find these results, plus a more detailed\r |
| 936 | description of the above drawbacks, in the long version of\r |
| 937 | this article.\r |
| 938 | \r |
| 939 | Because of these drawbacks, a new, still experimental, low\r |
| 940 | limit has been added to the throttling policy. If a group is\r |
| 941 | assigned a low limit, then the throttling policy automatically\r |
| 942 | limits the I/O of the other groups in such a way to guarantee\r |
| 943 | to the group a minimum bandwidth equal to its assigned low\r |
| 944 | limit. This new throttling mechanism throttles no group as\r |
| 945 | long as every group is getting at least its assigned minimum\r |
| 946 | bandwidth. I tested this mechanism, but did not consider the\r |
| 947 | interesting problem of guaranteeing minimum bandwidths while,\r |
| 948 | at the same time, enforcing maximum bandwidths.\r |
| 949 | \r |
| 950 | The other I/O policy available in Linux, proportional share,\r |
| 951 | provides weighted fairness. Each group is assigned a weight,\r |
| 952 | and should receive a portion of the total throughput\r |
| 953 | proportional to its weight. This scheme guarantees minimum\r |
| 954 | bandwidths in the same way that low limits do in throttling.\r |
| 955 | In particular, it guarantees to each group a minimum bandwidth\r |
| 956 | equal to the ratio between the weight of the group, and the\r |
| 957 | sum of the weights of all the groups that may be active at the\r |
| 958 | same time.\r |
| 959 | \r |
| 960 | The actual implementation of the proportional-share policy, on\r |
| 961 | a given drive, depends on what flavor of the block layer is in\r |
| 962 | use for that drive. If the drive is using the legacy block\r |
| 963 | interface, the policy is implemented by the cfq I/O scheduler.\r |
| 964 | Unfortunately, cfq fails to control bandwidths with\r |
| 965 | flash-based storage, especially on drives featuring command\r |
| 966 | queueing. This case is not considered in these tests. With\r |
| 967 | drives using the multiqueue interface, proportional share is\r |
| 968 | implemented by bfq. This is the combination considered in the\r |
| 969 | tests.\r |
| 970 | \r |
| 971 | To benchmark both throttling (low limits) and proportional\r |
| 972 | share, I tested, for each workload, the combinations of I/O\r |
| 973 | policies and I/O schedulers reported in the table below. In\r |
| 974 | the end, there are three test cases for each workload. In\r |
| 975 | addition, for some workloads, I considered two versions of bfq\r |
| 976 | for the proportional-share policy.\r |
| 977 | \r |
| 978 | Name\r |
| 979 | \r |
| 980 | I/O policy\r |
| 981 | \r |
| 982 | Scheduler\r |
| 983 | \r |
| 984 | Parameter for target\r |
| 985 | \r |
| 986 | Parameter for each of the four active interferers\r |
| 987 | \r |
| 988 | Parameter for each of the five inactive interferers\r |
| 989 | \r |
| 990 | Sum of parameters\r |
| 991 | \r |
| 992 | low-none\r |
| 993 | \r |
| 994 | Throttling with low limits\r |
| 995 | \r |
| 996 | none\r |
| 997 | \r |
| 998 | 10MB/s\r |
| 999 | \r |
| 1000 | 10MB/s (tot: 40)\r |
| 1001 | \r |
| 1002 | 20MB/s (tot: 100)\r |
| 1003 | \r |
| 1004 | 150MB/s\r |
| 1005 | \r |
| 1006 | prop-bfq\r |
| 1007 | \r |
| 1008 | Proportional share\r |
| 1009 | \r |
| 1010 | bfq\r |
| 1011 | \r |
| 1012 | 300\r |
| 1013 | \r |
| 1014 | 100 (tot: 400)\r |
| 1015 | \r |
| 1016 | 200 (tot: 1000)\r |
| 1017 | \r |
| 1018 | 1700\r |
| 1019 | \r |
| 1020 | For low limits, I report results with only none as the I/O\r |
| 1021 | scheduler, because the results are the same with kyber and\r |
| 1022 | mq-deadline.\r |
| 1023 | \r |
| 1024 | The capabilities of the storage medium and of low limits drove\r |
| 1025 | the policy configurations. In particular:\r |
| 1026 | \r |
| 1027 | The configuration of the target and of the active interferers\r |
| 1028 | for low-none is the one for which low-none provides its best\r |
| 1029 | possible minimum-bandwidth guarantee to the target: 10MB/s,\r |
| 1030 | guaranteed if all interferers are readers. Results remain the\r |
| 1031 | same regardless of the values used for target latency and idle\r |
| 1032 | time; I set them to 100µs and 1000µs, respectively, for every\r |
| 1033 | group.\r |
| 1034 | \r |
| 1035 | Low limits for inactive interferers are set to twice the\r |
| 1036 | limits for active interferers, to pose greater difficulties to\r |
| 1037 | the policy.\r |
| 1038 | \r |
| 1039 | I chose weights for prop-bfq so as to guarantee about the same\r |
| 1040 | minimum bandwidth as low-none to the target, in the same\r |
| 1041 | only-reader worst case as for low-none and to preserve,\r |
| 1042 | between the weights of active and inactive interferers, the\r |
| 1043 | same ratio as between the low limits of active and inactive\r |
| 1044 | interferers.\r |
| 1045 | \r |
| 1046 | Full details on configurations can be found in the long\r |
| 1047 | version of this article.\r |
| 1048 | \r |
| 1049 | Each workload was run ten times for each policy, plus ten\r |
| 1050 | times without any I/O control, i.e., with none as I/O\r |
| 1051 | scheduler and no I/O policy in use. For each run, I measured\r |
| 1052 | the I/O throughput of the target (which reveals the bandwidth\r |
| 1053 | provided to the target), the cumulative I/O throughput of the\r |
| 1054 | interferers, and the total I/O throughput. These quantities\r |
| 1055 | fluctuated very little during each run, as well as across\r |
| 1056 | different runs. Thus in the graphs I report only averages over\r |
| 1057 | per-run average throughputs. In particular, for the case of no\r |
| 1058 | I/O control, I report only the total I/O throughput, to give\r |
| 1059 | an idea of the throughput that can be reached without imposing\r |
| 1060 | any control.\r |
| 1061 | \r |
| 1062 | Results\r |
| 1063 | \r |
| 1064 | This plot shows throughput results for the simplest group of\r |
| 1065 | workloads: the static-sequential set.\r |
| 1066 | \r |
| 1067 | With a random reader as the target against sequential readers\r |
| 1068 | as interferers, low-none does guarantee the configured low\r |
| 1069 | limit to the target. Yet it reaches only a low total\r |
| 1070 | throughput. The throughput of the random reader evidently\r |
| 1071 | oscillates around 10MB/s during the test. This implies that it\r |
| 1072 | is at least slightly below 10MB/s for a significant percentage\r |
| 1073 | of the time. But when this happens, the low-limit mechanism\r |
| 1074 | limits the maximum bandwidth of every active group to the low\r |
| 1075 | limit set for the group, i.e., to just 10MB/s. The end result\r |
| 1076 | is a total throughput lower than 10% of the throughput reached\r |
| 1077 | without I/O control.\r |
| 1078 | \r |
| 1079 | That said, the high throughput achieved without I/O control is\r |
| 1080 | obtained by choking the random I/O of the target in favor of\r |
| 1081 | the sequential I/O of the interferers. Thus, it is probably\r |
| 1082 | more interesting to compare low-none throughput with the\r |
| 1083 | throughput reachable while actually guaranteeing 10MB/s to the\r |
| 1084 | target. The target is a single, synchronous, random reader,\r |
| 1085 | which reaches 23MB/s while active. So, to guarantee 10MB/s to\r |
| 1086 | the target, it is enough to serve it for about half of the\r |
| 1087 | time, and the interferers for the other half. Since the device\r |
| 1088 | reaches ~500MB/s with the sequential I/O of the interferers,\r |
| 1089 | the resulting throughput with this service scheme would be\r |
| 1090 | (500+23)/2, or about 260MB/s. low-none thus reaches less than\r |
| 1091 | 20% of the total throughput that could be reached while still\r |
| 1092 | preserving the target bandwidth.\r |
| 1093 | \r |
| 1094 | prop-bfq provides the target with a slightly higher throughput\r |
| 1095 | than low-none. This makes it harder for prop-bfq to reach a\r |
| 1096 | high total throughput, because prop-bfq serves more random I/O\r |
| 1097 | (from the target) than low-none. Nevertheless, prop-bfq gets a\r |
| 1098 | much higher total throughput than low-none. According to the\r |
| 1099 | above estimate, this throughput is about 90% of the maximum\r |
| 1100 | throughput that could be reached, for this workload, without\r |
| 1101 | violating service guarantees. The reason for this good result\r |
| 1102 | is that bfq provides an effective implementation of the\r |
| 1103 | proportional-share service policy. At any time, each active\r |
| 1104 | group is granted a fraction of the current total throughput,\r |
| 1105 | and the sum of these fractions is equal to one; so group\r |
| 1106 | bandwidths naturally saturate the available total throughput\r |
| 1107 | at all times.\r |
| 1108 | \r |
| 1109 | Things change with the second workload: a random reader\r |
| 1110 | against sequential writers. Now low-none reaches a much higher\r |
| 1111 | total throughput than prop-bfq. low-none serves much more\r |
| 1112 | sequential (write) I/O than prop-bfq because writes somehow\r |
| 1113 | break the low-limit mechanisms and prevail over the reads of\r |
| 1114 | the target. Conceivably, this happens because writes tend to\r |
| 1115 | both starve reads in the OS (mainly by eating all available\r |
| 1116 | I/O tags) and to cheat on their completion time in the drive.\r |
| 1117 | In contrast, bfq is intentionally configured to privilege\r |
| 1118 | reads, to counter these issues.\r |
| 1119 | \r |
| 1120 | In particular, low-none gets an even higher throughput than no\r |
| 1121 | I/O control at all because it penalizes the random I/O of the\r |
| 1122 | target even more than the no-controller configuration.\r |
| 1123 | \r |
| 1124 | Finally, with the last two workloads, prop-bfq reaches even\r |
| 1125 | higher total throughput than with the first two. It happens\r |
| 1126 | because the target also does sequential I/O, and serving\r |
| 1127 | sequential I/O is much more beneficial for throughput than\r |
| 1128 | serving random I/O. With these two workloads, the total\r |
| 1129 | throughput is, respectively, close to or much higher than that\r |
| 1130 | reached without I/O control. For the last workload, the total\r |
| 1131 | throughput is much higher because, differently from none, bfq\r |
| 1132 | privileges reads over asynchronous writes, and reads yield a\r |
| 1133 | higher throughput than writes. In contrast, low-none still\r |
| 1134 | gets lower or much lower throughput than prop-bfq, because of\r |
| 1135 | the same issues that hinder low-none throughput with the first\r |
| 1136 | two workloads.\r |
| 1137 | \r |
| 1138 | As for bandwidth guarantees, with readers as interferers\r |
| 1139 | (third workload), prop-bfq, as expected, gives the target a\r |
| 1140 | fraction of the total throughput proportional to its weight.\r |
| 1141 | bfq approximates perfect proportional-share bandwidth\r |
| 1142 | distribution among groups doing I/O of the same type (reads or\r |
| 1143 | writes) and with the same locality (sequential or random).\r |
| 1144 | With the last workload, prop-bfq gives much more throughput to\r |
| 1145 | the reader than to all the interferers, because interferers\r |
| 1146 | are asynchronous writers, and bfq privileges reads.\r |
| 1147 | \r |
| 1148 | The second group of workloads (static random), is the one,\r |
| 1149 | among all the workloads considered, for which prop-bfq\r |
| 1150 | performs worst. Results are shown below:\r |
| 1151 | \r |
| 1152 | This chart reports results not only for mainline bfq, but also\r |
| 1153 | for an improved version of bfq which is currently under public\r |
| 1154 | testing. As can be seen, with only random readers, prop-bfq\r |
| 1155 | reaches a much lower total throughput than low-none. This\r |
| 1156 | happens because of the Achilles heel of the bfq I/O scheduler.\r |
| 1157 | If the process in service does synchronous I/O and has a\r |
| 1158 | higher weight than some other process, then, to give strong\r |
| 1159 | bandwidth guarantees to that process, bfq plugs I/O\r |
| 1160 | dispatching every time the process temporarily stops issuing\r |
| 1161 | I/O requests. In this respect, processes actually have\r |
| 1162 | differentiated weights and do synchronous I/O in the workloads\r |
| 1163 | tested. So bfq systematically performs I/O plugging for them.\r |
| 1164 | Unfortunately, this plugging empties the internal queues of\r |
| 1165 | the drive, which kills throughput with random I/O. And the I/O\r |
| 1166 | of all processes in these workloads is also random.\r |
| 1167 | \r |
| 1168 | The situation reverses with a sequential reader as target.\r |
| 1169 | Yet, the most interesting results come from the new version of\r |
| 1170 | bfq, containing small changes to counter exactly the above\r |
| 1171 | weakness. This version recovers most of the throughput loss\r |
| 1172 | with the workload made of only random I/O and more; with the\r |
| 1173 | second workload, where the target is a sequential reader, it\r |
| 1174 | reaches about 3.7 times the total throughput of low-none.\r |
| 1175 | \r |
| 1176 | When the main concern is the latency of flows containing short\r |
| 1177 | I/O, Linux seems now rather high performing, thanks to the bfq\r |
| 1178 | I/O scheduler and the I/O latency controller. But if the\r |
| 1179 | requirement is to provide explicit bandwidth guarantees (or\r |
| 1180 | just fairness) to I/O flows, then one must be ready to give up\r |
| 1181 | much or most of the speed of the storage media. bfq helps with\r |
| 1182 | some workloads, but loses most of the throughput with\r |
| 1183 | workloads consisting of mostly random I/O. Fortunately, there\r |
| 1184 | is apparently hope for much better performance since an\r |
| 1185 | improvement, still under development, seems to enable bfq to\r |
| 1186 | reach a high throughput with all workloads tested so far.\r |
| 1187 | \r |
| 1188 | [ I wish to thank Vivek Goyal for enabling me to make this\r |
| 1189 | article much more fair and sound.]\r |
| 1190 | \r |
| 1191 | [66]Comments (4 posted)\r |
| 1192 | \r |
| 1193 | [67]KDE's onboarding initiative, one year later\r |
| 1194 | \r |
| 1195 | August 24, 2018\r |
| 1196 | \r |
| 1197 | This article was contributed by Marta Rybczyńska\r |
| 1198 | \r |
| 1199 | [68]Akademy\r |
| 1200 | \r |
| 1201 | In 2017, the KDE community decided on [69]three goals to\r |
| 1202 | concentrate on for the next few years. One of them was\r |
| 1203 | [70]streamlining the onboarding of new contributors (the\r |
| 1204 | others were [71]improving usability and [72]privacy ). During\r |
| 1205 | [73]Akademy , the yearly KDE conference that was held in\r |
| 1206 | Vienna in August, Neofytos Kolokotronis shared the status of\r |
| 1207 | the onboarding goal, the work done during the last year, and\r |
| 1208 | further plans. While it is a complicated process in a project\r |
| 1209 | as big and diverse as KDE, numerous improvements have been\r |
| 1210 | already made.\r |
| 1211 | \r |
| 1212 | Two of the three KDE community goals were proposed by relative\r |
| 1213 | newcomers. Kolokotronis was one of those, having joined the\r |
| 1214 | [74]KDE Promo team not long before proposing the focus on\r |
| 1215 | onboarding. He had previously been involved with [75]Chakra\r |
| 1216 | Linux , a distribution based on KDE software. The fact that\r |
| 1217 | new members of the community proposed strategic goals was also\r |
| 1218 | noted in the [76]Sunday keynote by Claudia Garad .\r |
| 1219 | \r |
| 1220 | Proper onboarding adds excitement to the contribution process\r |
| 1221 | and increases retention, he explained. When we look at [77]the\r |
| 1222 | definition of onboarding , it is a process in which the new\r |
| 1223 | contributors acquire knowledge, skills, and behaviors so that\r |
| 1224 | they can contribute effectively. Kolokotronis proposed to see\r |
| 1225 | it also as socialization: integration into the project's\r |
| 1226 | relationships, culture, structure, and procedures.\r |
| 1227 | \r |
| 1228 | The gains from proper onboarding are many. The project can\r |
| 1229 | grow by attracting new blood with new perspectives and\r |
| 1230 | solutions. The community maintains its health and stays\r |
| 1231 | vibrant. Another important advantage of efficient onboarding\r |
| 1232 | is that replacing current contributors becomes easier when\r |
| 1233 | they change interests, jobs, or leave the project for whatever\r |
| 1234 | reason. Finally, successful onboarding adds new advocates to\r |
| 1235 | the project.\r |
| 1236 | \r |
| 1237 | Achievements so far and future plans\r |
| 1238 | \r |
| 1239 | The team started with ideas for a centralized onboarding\r |
| 1240 | process for the whole of KDE. They found out quickly that this\r |
| 1241 | would not work because KDE is "very decentralized", so it is\r |
| 1242 | hard to provide tools and procedures that are going to work\r |
| 1243 | for the whole project. According to Kolokotronis, other\r |
| 1244 | characteristics of KDE that impact onboarding are high\r |
| 1245 | diversity, remote and online teams, and hundreds of\r |
| 1246 | contributors in dozens of projects and teams. In addition, new\r |
| 1247 | contributors already know in which area they want to take part\r |
| 1248 | and they prefer specific information that will be directly\r |
| 1249 | useful for them.\r |
| 1250 | \r |
| 1251 | So the team changed its approach; several changes have since\r |
| 1252 | been proposed and implemented. The [78]Get Involved page,\r |
| 1253 | which is expected to be one of the resources new contributors\r |
| 1254 | read first, has been rewritten. For the [79]Junior Jobs page ,\r |
| 1255 | the team is [80] [81]discussing what the generic content for\r |
| 1256 | KDE as a whole should be. The team simplified [82]Phabricator\r |
| 1257 | registration , which resulted in documenting the process\r |
| 1258 | better. Another part of the work includes the [83]KDE Bugzilla\r |
| 1259 | ; it includes, for example initiatives to limit the number of\r |
| 1260 | states of a ticket or remove obsolete products.\r |
| 1261 | \r |
| 1262 | The [84]Plasma Mobile team is heavily involved in the\r |
| 1263 | onboarding goal. The Plasma Mobile developers have simplified\r |
| 1264 | their development environment setup and created an\r |
| 1265 | [85]interactive "Get Involved" page. In addition, the Plasma\r |
| 1266 | team changed the way task descriptions are written; they now\r |
| 1267 | contain more detail, so that it is easier to get involved. The\r |
| 1268 | basic description should be short and clear, and it should\r |
| 1269 | include details of the problem and possible solutions. The\r |
| 1270 | developers try to share the list of skills necessary to\r |
| 1271 | fulfill the tasks and include clear links to the technical\r |
| 1272 | resources needed.\r |
| 1273 | \r |
| 1274 | Kolokotronis and team also identified a new potential source\r |
| 1275 | of contributors for KDE: distributions using KDE. They have\r |
| 1276 | the advantage of already knowing and using the software. The\r |
| 1277 | next idea the team is working on is to make sure that setting\r |
| 1278 | up a development environment is easy. The team plans to work\r |
| 1279 | on this during a dedicated sprint this autumn.\r |
| 1280 | \r |
| 1281 | Searching for new contributors\r |
| 1282 | \r |
| 1283 | Kolokotronis plans to search for new contributors at the\r |
| 1284 | periphery of the project, among the "skilled enthusiasts":\r |
| 1285 | loyal users who actually care about the project. They "can\r |
| 1286 | make wonders", he said. Those individuals may be also less\r |
| 1287 | confident or shy, have troubles making the first step, and\r |
| 1288 | need guidance. The project leaders should take that into\r |
| 1289 | account.\r |
| 1290 | \r |
| 1291 | In addition, newcomers are all different. Kolokotronis\r |
| 1292 | provided a long list of how contributors differ, including\r |
| 1293 | skills and knowledge, motives and interests, and time and\r |
| 1294 | dedication. His advice is to "try to find their superpower",\r |
| 1295 | the skills they have that are missing in the team. Those\r |
| 1296 | "superpowers" can then be used for the benefit of the project.\r |
| 1297 | \r |
| 1298 | If a project does nothing else, he said, it can start with its\r |
| 1299 | documentation. However, this does not only mean code\r |
| 1300 | documentation. Writing down the procedures or information\r |
| 1301 | about the internal work of the project, like who is working on\r |
| 1302 | what, is an important part of a project's documentation and\r |
| 1303 | helps newcomers. There should be also guidelines on how to\r |
| 1304 | start, especially setting up the development environment.\r |
| 1305 | \r |
| 1306 | The first thing the project leaders should do, according to\r |
| 1307 | Kolokotronis, is to spend time on introducing newcomers to the\r |
| 1308 | project. Ideally every new contributor should be assigned\r |
| 1309 | mentors — more experienced members who can help them when\r |
| 1310 | needed. The mentors and project leaders should find tasks that\r |
| 1311 | are interesting for each person. Answering an audience\r |
| 1312 | question on suggestions for shy new contributors, he\r |
| 1313 | recommended even more mentoring. It is also very helpful to\r |
| 1314 | make sure that newcomers have enough to read, but "avoid\r |
| 1315 | RTFM", he highlighted. It is also easy for a new contributor\r |
| 1316 | "to fly away", he said. The solution is to keep requesting\r |
| 1317 | things and be proactive.\r |
| 1318 | \r |
| 1319 | What the project can do?\r |
| 1320 | \r |
| 1321 | Kolokotronis suggested a number of actions for a project when\r |
| 1322 | it wants to improve its onboarding. The first step is\r |
| 1323 | preparation: the project leaders should know the team's and\r |
| 1324 | the project's needs. Long-term planning is important, too. It\r |
| 1325 | is not enough to wait for contributors to come — the project\r |
| 1326 | should be proactive, which means reaching out to candidates,\r |
| 1327 | suggesting appropriate tasks and, finally, making people\r |
| 1328 | available for the newcomers if they need help.\r |
| 1329 | \r |
| 1330 | This leads to next step: to be a mentor. Kolokotronis suggests\r |
| 1331 | being a "great host", but also trying to phase out the\r |
| 1332 | dependency on the mentor rapidly. "We have been all\r |
| 1333 | newcomers", he said. It can be intimidating to join an\r |
| 1334 | existing group. Onboarding creates a sense of belonging which,\r |
| 1335 | in turn, increases retention.\r |
| 1336 | \r |
| 1337 | The last step proposed was to be strategic. This includes\r |
| 1338 | thinking about the emotions you want newcomers to feel.\r |
| 1339 | Kolokotronis explained the strategic part with an example. The\r |
| 1340 | overall goal is (surprise!) improve onboarding of new\r |
| 1341 | contributors. An intermediate objective might be to keep the\r |
| 1342 | newcomers after they have made their first commit. If your\r |
| 1343 | strategy is to keep them confident and proud, you can use\r |
| 1344 | different tactics like praise and acknowledgment of the work\r |
| 1345 | in public. Another useful tactic may be assigning simple\r |
| 1346 | tasks, according to the skill of the contributor.\r |
| 1347 | \r |
| 1348 | To summarize, the most important thing, according to\r |
| 1349 | Kolokotronis, is to respond quickly and spend time with new\r |
| 1350 | contributors. This time should be used to explain procedures,\r |
| 1351 | and to introduce the people and culture. It is also essential\r |
| 1352 | to guide first contributions and praise contributor's skill\r |
| 1353 | and effort. Increase the difficulty of tasks over time to keep\r |
| 1354 | contributors motivated and challenged. And finally, he said,\r |
| 1355 | "turn them into mentors".\r |
| 1356 | \r |
| 1357 | Kolokotronis acknowledges that onboarding "takes time" and\r |
| 1358 | "everyone complains" about it. However, he is convinced that\r |
| 1359 | it is beneficial in the long term and that it decreases\r |
| 1360 | developer turnover.\r |
| 1361 | \r |
| 1362 | Advice to newcomers\r |
| 1363 | \r |
| 1364 | Kolokotronis concluded with some suggestions for newcomers to\r |
| 1365 | a project. They should try to be persistent and to not get\r |
| 1366 | discouraged when something goes wrong. Building connections\r |
| 1367 | from the very beginning is helpful. He suggests asking\r |
| 1368 | questions as if you were already a member "and things will be\r |
| 1369 | fine". However, accept criticism if it happens.\r |
| 1370 | \r |
| 1371 | One of the next actions of the onboarding team will be to\r |
| 1372 | collect feedback from newcomers and experienced contributors\r |
| 1373 | to see if they agree on the ideas and processes introduced so\r |
| 1374 | far.\r |
| 1375 | \r |
| 1376 | [86]Comments (none posted)\r |
| 1377 | \r |
| 1378 | [87]Sharing and archiving data sets with Dat\r |
| 1379 | \r |
| 1380 | August 27, 2018\r |
| 1381 | \r |
| 1382 | This article was contributed by Antoine Beaupré\r |
| 1383 | \r |
| 1384 | [88]Dat is a new peer-to-peer protocol that uses some of the\r |
| 1385 | concepts of [89]BitTorrent and Git. Dat primarily targets\r |
| 1386 | researchers and open-data activists as it is a great tool for\r |
| 1387 | sharing, archiving, and cataloging large data sets. But it can\r |
| 1388 | also be used to implement decentralized web applications in a\r |
| 1389 | novel way.\r |
| 1390 | \r |
| 1391 | Dat quick primer\r |
| 1392 | \r |
| 1393 | Dat is written in JavaScript, so it can be installed with npm\r |
| 1394 | , but there are [90]standalone binary builds and a [91]desktop\r |
| 1395 | application (as an AppImage). An [92]online viewer can be used\r |
| 1396 | to inspect data for those who do not want to install arbitrary\r |
| 1397 | binaries on their computers.\r |
| 1398 | \r |
| 1399 | The command-line application allows basic operations like\r |
| 1400 | downloading existing data sets and sharing your own. Dat uses\r |
| 1401 | a 32-byte hex string that is an [93]ed25519 public key , which\r |
| 1402 | is is used to discover and find content on the net. For\r |
| 1403 | example, this will download some sample data: $ dat clone \\r |
| 1404 | \r |
| 1405 | dat://778f8d955175c92e4ced5e4f5563f69bfec0c86cc6f670352c457943-\r |
| 1406 | 666fe639 \\r |
| 1407 | \r |
| 1408 | ~/Downloads/dat-demo\r |
| 1409 | \r |
| 1410 | Similarly, the share command is used to share content. It\r |
| 1411 | indexes the files in a given directory and creates a new\r |
| 1412 | unique address like the one above. The share command starts a\r |
| 1413 | server that uses multiple discovery mechanisms (currently, the\r |
| 1414 | [94]Mainline Distributed Hash Table (DHT), a [95]custom DNS\r |
| 1415 | server , and multicast DNS) to announce the content to its\r |
| 1416 | peers. This is how another user, armed with that public key,\r |
| 1417 | can download that content with dat clone or mirror the files\r |
| 1418 | continuously with dat sync .\r |
| 1419 | \r |
| 1420 | So far, this looks a lot like BitTorrent [96]magnet links\r |
| 1421 | updated with 21st century cryptography. But Dat adds revisions\r |
| 1422 | on top of that, so modifications are automatically shared\r |
| 1423 | through the swarm. That is important for public data sets as\r |
| 1424 | those are often dynamic in nature. Revisions also make it\r |
| 1425 | possible to use [97]Dat as a backup system by saving the data\r |
| 1426 | incrementally using an [98]archiver .\r |
| 1427 | \r |
| 1428 | While Dat is designed to work on larger data sets, processing\r |
| 1429 | them for sharing may take a while. For example, sharing the\r |
| 1430 | Linux kernel source code required about five minutes as Dat\r |
| 1431 | worked on indexing all of the files. This is comparable to the\r |
| 1432 | performance offered by [99]IPFS and BitTorrent. Data sets with\r |
| 1433 | more or larger files may take quite a bit more time.\r |
| 1434 | \r |
| 1435 | One advantage that Dat has over IPFS is that it doesn't\r |
| 1436 | duplicate the data. When IPFS imports new data, it duplicates\r |
| 1437 | the files into ~/.ipfs . For collections of small files like\r |
| 1438 | the kernel, this is not a huge problem, but for larger files\r |
| 1439 | like videos or music, it's a significant limitation. IPFS\r |
| 1440 | eventually implemented a solution to this [100]problem in the\r |
| 1441 | form of the experimental [101]filestore feature , but it's not\r |
| 1442 | enabled by default. Even with that feature enabled, though,\r |
| 1443 | changes to data sets are not automatically tracked. In\r |
| 1444 | comparison, Dat operation on dynamic data feels much lighter.\r |
| 1445 | The downside is that each set needs its own dat share process.\r |
| 1446 | \r |
| 1447 | Like any peer-to-peer system, Dat needs at least one peer to\r |
| 1448 | stay online to offer the content, which is impractical for\r |
| 1449 | mobile devices. Hosting providers like [102]Hashbase (which is\r |
| 1450 | a [103]pinning service in Dat jargon) can help users keep\r |
| 1451 | content online without running their own [104]server . The\r |
| 1452 | closest parallel in the traditional web ecosystem would\r |
| 1453 | probably be content distribution networks (CDN) although\r |
| 1454 | pinning services are not necessarily geographically\r |
| 1455 | distributed and a CDN does not necessarily retain a complete\r |
| 1456 | copy of a website. [105]\r |
| 1457 | \r |
| 1458 | A web browser called [106]Beaker , based on the [107]Electron\r |
| 1459 | framework, can access Dat content natively without going\r |
| 1460 | through a pinning service. Furthermore, Beaker is essential to\r |
| 1461 | get any of the [108]Dat applications working, as they\r |
| 1462 | fundamentally rely on dat:// URLs to do their magic. This\r |
| 1463 | means that Dat applications won't work for most users unless\r |
| 1464 | they install that special web browser. There is a [109]Firefox\r |
| 1465 | extension called " [110]dat-fox " for people who don't want to\r |
| 1466 | install yet another browser, but it requires installing a\r |
| 1467 | [111]helper program . The extension will be able to load\r |
| 1468 | dat:// URLs but many applications will still not work. For\r |
| 1469 | example, the [112]photo gallery application completely fails\r |
| 1470 | with dat-fox.\r |
| 1471 | \r |
| 1472 | Dat-based applications look promising from a privacy point of\r |
| 1473 | view. Because of its peer-to-peer nature, users regain control\r |
| 1474 | over where their data is stored: either on their own computer,\r |
| 1475 | an online server, or by a trusted third party. But considering\r |
| 1476 | the protocol is not well established in current web browsers,\r |
| 1477 | I foresee difficulties in adoption of that aspect of the Dat\r |
| 1478 | ecosystem. Beyond that, it is rather disappointing that Dat\r |
| 1479 | applications cannot run natively in a web browser given that\r |
| 1480 | JavaScript is designed exactly for that.\r |
| 1481 | \r |
| 1482 | Dat privacy\r |
| 1483 | \r |
| 1484 | An advantage Dat has over other peer-to-peer protocols like\r |
| 1485 | BitTorrent is end-to-end encryption. I was originally\r |
| 1486 | concerned by the encryption design when reading the\r |
| 1487 | [113]academic paper [PDF] :\r |
| 1488 | \r |
| 1489 | It is up to client programs to make design decisions around\r |
| 1490 | which discovery networks they trust. For example if a Dat\r |
| 1491 | client decides to use the BitTorrent DHT to discover peers,\r |
| 1492 | and they are searching for a publicly shared Dat key (e.g. a\r |
| 1493 | key cited publicly in a published scientific paper) with known\r |
| 1494 | contents, then because of the privacy design of the BitTorrent\r |
| 1495 | DHT it becomes public knowledge what key that client is\r |
| 1496 | searching for.\r |
| 1497 | \r |
| 1498 | So in other words, to share a secret file with another user,\r |
| 1499 | the public key is transmitted over a secure side-channel, only\r |
| 1500 | to then leak during the discovery process. Fortunately, the\r |
| 1501 | public Dat key is not directly used during discovery as it is\r |
| 1502 | [114]hashed with BLAKE2B . Still, the security model of Dat\r |
| 1503 | assumes the public key is private, which is a rather\r |
| 1504 | counterintuitive concept that might upset cryptographers and\r |
| 1505 | confuse users who are frequently encouraged to type such\r |
| 1506 | strings in address bars and search engines as part of the Dat\r |
| 1507 | experience. There is a [115]security & privacy FAQ in the Dat\r |
| 1508 | documentation warning about this problem:\r |
| 1509 | \r |
| 1510 | One of the key elements of Dat privacy is that the public key\r |
| 1511 | is never used in any discovery network. The public key is\r |
| 1512 | hashed, creating the discovery key. Whenever peers attempt to\r |
| 1513 | connect to each other, they use the discovery key.\r |
| 1514 | \r |
| 1515 | Data is encrypted using the public key, so it is important\r |
| 1516 | that this key stays secure.\r |
| 1517 | \r |
| 1518 | There are other privacy issues outlined in the document; it\r |
| 1519 | states that " Dat faces similar privacy risks as BitTorrent ":\r |
| 1520 | \r |
| 1521 | When you download a dataset, your IP address is exposed to the\r |
| 1522 | users sharing that dataset. This may lead to honeypot servers\r |
| 1523 | collecting IP addresses, as we've seen in Bittorrent. However,\r |
| 1524 | with dataset sharing we can create a web of trust model where\r |
| 1525 | specific institutions are trusted as primary sources for\r |
| 1526 | datasets, diminishing the sharing of IP addresses.\r |
| 1527 | \r |
| 1528 | A Dat blog post refers to this issue as [116]reader privacy\r |
| 1529 | and it is, indeed, a sensitive issue in peer-to-peer networks.\r |
| 1530 | It is how BitTorrent users are discovered and served scary\r |
| 1531 | verbiage from lawyers, after all. But Dat makes this a little\r |
| 1532 | better because, to join a swarm, you must know what you are\r |
| 1533 | looking for already, which means peers who can look at swarm\r |
| 1534 | activity only include users who know the secret public key.\r |
| 1535 | This works well for secret content, but for larger, public\r |
| 1536 | data sets, it is a real problem; it is why the Dat project has\r |
| 1537 | [117]avoided creating a Wikipedia mirror so far.\r |
| 1538 | \r |
| 1539 | I found another privacy issue that is not documented in the\r |
| 1540 | security FAQ during my review of the protocol. As mentioned\r |
| 1541 | earlier, the [118]Dat discovery protocol routinely phones home\r |
| 1542 | to DNS servers operated by the Dat project. This implies that\r |
| 1543 | the default discovery servers (and an attacker watching over\r |
| 1544 | their traffic) know who is publishing or seeking content, in\r |
| 1545 | essence discovering the "social network" behind Dat. This\r |
| 1546 | discovery mechanism can be disabled in clients, but a similar\r |
| 1547 | privacy issue applies to the DHT as well, although that is\r |
| 1548 | distributed so it doesn't require trust of the Dat project\r |
| 1549 | itself.\r |
| 1550 | \r |
| 1551 | Considering those aspects of the protocol, privacy-conscious\r |
| 1552 | users will probably want to use Tor or other anonymization\r |
| 1553 | techniques to work around those concerns.\r |
| 1554 | \r |
| 1555 | The future of Dat\r |
| 1556 | \r |
| 1557 | [119]Dat 2.0 was released in June 2017 with performance\r |
| 1558 | improvements and protocol changes. [120]Dat Enhancement\r |
| 1559 | Proposals (DEPs) guide the project's future development; most\r |
| 1560 | work is currently geared toward implementing the draft "\r |
| 1561 | [121]multi-writer proposal " in [122]HyperDB . Without\r |
| 1562 | multi-writer support, only the original publisher of a Dat can\r |
| 1563 | modify it. According to Joe Hand, co-executive-director of\r |
| 1564 | [123]Code for Science & Society (CSS) and Dat core developer,\r |
| 1565 | in an IRC chat, "supporting multiwriter is a big requirement\r |
| 1566 | for lots of folks". For example, while Dat might allow Alice\r |
| 1567 | to share her research results with Bob, he cannot modify or\r |
| 1568 | contribute back to those results. The multi-writer extension\r |
| 1569 | allows for Alice to assign trust to Bob so he can have write\r |
| 1570 | access to the data.\r |
| 1571 | \r |
| 1572 | Unfortunately, the current proposal doesn't solve the " hard\r |
| 1573 | problems " of " conflict merges and secure key distribution ".\r |
| 1574 | The former will be worked out through user interface tweaks,\r |
| 1575 | but the latter is a classic problem that security projects\r |
| 1576 | have typically trouble finding solutions for—Dat is no\r |
| 1577 | exception. How will Alice securely trust Bob? The OpenPGP web\r |
| 1578 | of trust? Hexadecimal fingerprints read over the phone? Dat\r |
| 1579 | doesn't provide a magic solution to this problem.\r |
| 1580 | \r |
| 1581 | Another thing limiting adoption is that Dat is not packaged in\r |
| 1582 | any distribution that I could find (although I [124]requested\r |
| 1583 | it in Debian ) and, considering the speed of change of the\r |
| 1584 | JavaScript ecosystem, this is unlikely to change any time\r |
| 1585 | soon. A [125]Rust implementation of the Dat protocol has\r |
| 1586 | started, however, which might be easier to package than the\r |
| 1587 | multitude of [126]Node.js modules. In terms of mobile device\r |
| 1588 | support, there is an experimental Android web browser with Dat\r |
| 1589 | support called [127]Bunsen , which somehow doesn't run on my\r |
| 1590 | phone. Some adventurous users have successfully run Dat in\r |
| 1591 | [128]Termux . I haven't found an app running on iOS at this\r |
| 1592 | point.\r |
| 1593 | \r |
| 1594 | Even beyond platform support, distributed protocols like Dat\r |
| 1595 | have a tough slope to climb against the virtual monopoly of\r |
| 1596 | more centralized protocols, so it remains to be seen how\r |
| 1597 | popular those tools will be. Hand says Dat is supported by\r |
| 1598 | multiple non-profit organizations. Beyond CSS, [129]Blue Link\r |
| 1599 | Labs is working on the Beaker Browser as a self-funded startup\r |
| 1600 | and a grass-roots organization, [130]Digital Democracy , has\r |
| 1601 | contributed to the project. The [131]Internet Archive has\r |
| 1602 | [132]announced a collaboration between itself, CSS, and the\r |
| 1603 | California Digital Library to launch a pilot project to see "\r |
| 1604 | how members of a cooperative, decentralized network can\r |
| 1605 | leverage shared services to ensure data preservation while\r |
| 1606 | reducing storage costs and increasing replication counts ".\r |
| 1607 | \r |
| 1608 | Hand said adoption in academia has been "slow but steady" and\r |
| 1609 | that the [133]Dat in the Lab project has helped identify areas\r |
| 1610 | that could help researchers adopt the project. Unfortunately,\r |
| 1611 | as is the case with many free-software projects, he said that\r |
| 1612 | "our team is definitely a bit limited on bandwidth to push for\r |
| 1613 | bigger adoption". Hand said that the project received a grant\r |
| 1614 | from [134]Mozilla Open Source Support to improve its\r |
| 1615 | documentation, which will be a big help.\r |
| 1616 | \r |
| 1617 | Ultimately, Dat suffers from a problem common to all\r |
| 1618 | peer-to-peer applications, which is naming. Dat addresses are\r |
| 1619 | not exactly intuitive: humans do not remember strings of 64\r |
| 1620 | hexadecimal characters well. For this, Dat took a [135]similar\r |
| 1621 | approach to IPFS by using DNS TXT records and /.well-known URL\r |
| 1622 | paths to bridge existing, human-readable names with Dat\r |
| 1623 | hashes. So this sacrifices a part of the decentralized nature\r |
| 1624 | of the project in favor of usability.\r |
| 1625 | \r |
| 1626 | I have tested a lot of distributed protocols like Dat in the\r |
| 1627 | past and I am not sure Dat is a clear winner. It certainly has\r |
| 1628 | advantages over IPFS in terms of usability and resource usage,\r |
| 1629 | but the lack of packages on most platforms is a big limit to\r |
| 1630 | adoption for most people. This means it will be difficult to\r |
| 1631 | share content with my friends and family with Dat anytime\r |
| 1632 | soon, which would probably be my primary use case for the\r |
| 1633 | project. Until the protocol reaches the wider adoption that\r |
| 1634 | BitTorrent has seen in terms of platform support, I will\r |
| 1635 | probably wait before switching everything over to this\r |
| 1636 | promising project.\r |
| 1637 | \r |
| 1638 | [136]Comments (11 posted)\r |
| 1639 | \r |
| 1640 | Page editor : Jonathan Corbet\r |
| 1641 | \r |
| 1642 | Inside this week's LWN.net Weekly Edition\r |
| 1643 | \r |
| 1644 | [137]Briefs : OpenSSH 7.8; 4.19-rc1; Which stable?; Netdev\r |
| 1645 | 0x12; Bison 3.1; Quotes; ...\r |
| 1646 | \r |
| 1647 | [138]Announcements : Newsletters; events; security updates;\r |
| 1648 | kernel patches; ... Next page : [139]Brief items>>\r |
| 1649 | \r |
| 1650 | \r |
| 1651 | \r |
| 1652 | [1] https://lwn.net/Articles/763743/\r |
| 1653 | \r |
| 1654 | [2] https://lwn.net/Articles/763626/\r |
| 1655 | \r |
| 1656 | [3] https://lwn.net/Articles/763641/\r |
| 1657 | \r |
| 1658 | [4] https://lwn.net/Articles/763106/\r |
| 1659 | \r |
| 1660 | [5] https://lwn.net/Articles/763603/\r |
| 1661 | \r |
| 1662 | [6] https://lwn.net/Articles/763175/\r |
| 1663 | \r |
| 1664 | [7] https://lwn.net/Articles/763492/\r |
| 1665 | \r |
| 1666 | [8] https://lwn.net/Articles/763254/\r |
| 1667 | \r |
| 1668 | [9] https://lwn.net/Articles/763255/\r |
| 1669 | \r |
| 1670 | [10] https://lwn.net/Articles/763743/#Comments\r |
| 1671 | \r |
| 1672 | [11] https://lwn.net/Articles/763626/\r |
| 1673 | \r |
| 1674 | [12] http://julialang.org/\r |
| 1675 | \r |
| 1676 | [13] https://julialang.org/blog/2018/08/one-point-zero\r |
| 1677 | \r |
| 1678 | [14] https://julialang.org/benchmarks/\r |
| 1679 | \r |
| 1680 | [15] https://juliacomputing.com/\r |
| 1681 | \r |
| 1682 | [16] https://en.wikipedia.org/wiki/Read%E2%80%93eval%E2%80%93p-\r |
| 1683 | rint_loop\r |
| 1684 | \r |
| 1685 | [17] http://llvm.org/\r |
| 1686 | \r |
| 1687 | [18] http://www.3blue1brown.com/essence-of-linear-algebra-page/\r |
| 1688 | \r |
| 1689 | [19] http://www.netlib.org/lapack/\r |
| 1690 | \r |
| 1691 | [20] https://lwn.net/Articles/657157/\r |
| 1692 | \r |
| 1693 | [21] https://julialang.org/publications/julia-fresh-approach-B-\r |
| 1694 | EKS.pdf\r |
| 1695 | \r |
| 1696 | [22] https://lwn.net/Articles/738915/\r |
| 1697 | \r |
| 1698 | [23] https://pypy.org/\r |
| 1699 | \r |
| 1700 | [24] https://github.com/JuliaPy/PyCall.jl\r |
| 1701 | \r |
| 1702 | [25] https://github.com/JuliaInterop/RCall.jl\r |
| 1703 | \r |
| 1704 | [26] https://docs.julialang.org/en/stable/\r |
| 1705 | \r |
| 1706 | [27] https://julialang.org/learning/\r |
| 1707 | \r |
| 1708 | [28] http://bogumilkaminski.pl/files/julia_express.pdf\r |
| 1709 | \r |
| 1710 | [29] https://docs.julialang.org/en/stable/manual/noteworthy-di-\r |
| 1711 | fferences/#Noteworthy-differences-from-Python-1\r |
| 1712 | \r |
| 1713 | [30] https://lwn.net/Articles/746386/\r |
| 1714 | \r |
| 1715 | [31] https://github.com/JuliaLang/IJulia.jl\r |
| 1716 | \r |
| 1717 | [32] https://lwn.net/Articles/764001/\r |
| 1718 | \r |
| 1719 | [33] https://lwn.net/Articles/763626/#Comments\r |
| 1720 | \r |
| 1721 | [34] https://lwn.net/Articles/763641/\r |
| 1722 | \r |
| 1723 | [35] https://lwn.net/Archives/ConferenceByYear/#2018-Linux_Sec-\r |
| 1724 | urity_Summit_NA\r |
| 1725 | \r |
| 1726 | [36] https://events.linuxfoundation.org/events/linux-security-\r |
| 1727 | summit-north-america-2018/\r |
| 1728 | \r |
| 1729 | [37] https://kernsec.org/wiki/index.php/Kernel_Self_Protection-\r |
| 1730 | _Project\r |
| 1731 | \r |
| 1732 | [38] https://lwn.net/Articles/763644/\r |
| 1733 | \r |
| 1734 | [39] https://raphlinus.github.io/programming/rust/2018/08/17/u-\r |
| 1735 | ndefined-behavior.html\r |
| 1736 | \r |
| 1737 | [40] https://lwn.net/Articles/749064/\r |
| 1738 | \r |
| 1739 | [41] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/-\r |
| 1740 | linux.git/commit/?id=02361bc77888\r |
| 1741 | \r |
| 1742 | [42] https://lore.kernel.org/lkml/CA+55aFzCG-zNmZwX4A2FQpadafL-\r |
| 1743 | fEzK6CC=qPXydAacU1RqZWA@mail.gmail.com/T/#u\r |
| 1744 | \r |
| 1745 | [43] https://lwn.net/Articles/758245/\r |
| 1746 | \r |
| 1747 | [44] https://lwn.net/Articles/718888/\r |
| 1748 | \r |
| 1749 | [45] https://lwn.net/Articles/744507/\r |
| 1750 | \r |
| 1751 | [46] https://outflux.net/slides/2018/lss/danger.pdf\r |
| 1752 | \r |
| 1753 | [47] https://lwn.net/Articles/763641/#Comments\r |
| 1754 | \r |
| 1755 | [48] https://lwn.net/Articles/763106/\r |
| 1756 | \r |
| 1757 | [49] https://lwn.net/Articles/763497/\r |
| 1758 | \r |
| 1759 | [50] https://lwn.net/Articles/762566/\r |
| 1760 | \r |
| 1761 | [51] https://lwn.net/Articles/761118/\r |
| 1762 | \r |
| 1763 | [52] https://git.kernel.org/linus/d5791044d2e5749ef4de84161cec-\r |
| 1764 | 5532e2111540\r |
| 1765 | \r |
| 1766 | [53] https://lwn.net/ml/linux-kernel/20180630000253.70103-1-sq-\r |
| 1767 | ue@chromium.org/\r |
| 1768 | \r |
| 1769 | [54] https://git.kernel.org/linus/771c035372a036f83353eef46dbb-\r |
| 1770 | 829780330234\r |
| 1771 | \r |
| 1772 | [55] https://lwn.net/Articles/745073/\r |
| 1773 | \r |
| 1774 | [56] https://lwn.net/ml/linux-kernel/CA+55aFxFjAmrFpwQmEHCthHO-\r |
| 1775 | zgidCKnod+cNDEE+3Spu9o1s3w@mail.gmail.com/\r |
| 1776 | \r |
| 1777 | [57] https://lwn.net/Articles/759499/\r |
| 1778 | \r |
| 1779 | [58] https://lwn.net/Articles/762355/\r |
| 1780 | \r |
| 1781 | [59] https://lwn.net/ml/linux-fsdevel/20180823223145.GK6515@Ze-\r |
| 1782 | nIV.linux.org.uk/\r |
| 1783 | \r |
| 1784 | [60] https://lwn.net/Articles/763106/#Comments\r |
| 1785 | \r |
| 1786 | [61] https://lwn.net/Articles/763603/\r |
| 1787 | \r |
| 1788 | [62] https://lwn.net/Articles/601799/\r |
| 1789 | \r |
| 1790 | [63] https://lwn.net/Articles/552904\r |
| 1791 | \r |
| 1792 | [64] https://lwn.net/Articles/758963/\r |
| 1793 | \r |
| 1794 | [65] http://algogroup.unimore.it/people/paolo/pub-docs/extende-\r |
| 1795 | d-lat-bw-throughput.pdf\r |
| 1796 | \r |
| 1797 | [66] https://lwn.net/Articles/763603/#Comments\r |
| 1798 | \r |
| 1799 | [67] https://lwn.net/Articles/763175/\r |
| 1800 | \r |
| 1801 | [68] https://lwn.net/Archives/ConferenceByYear/#2018-Akademy\r |
| 1802 | \r |
| 1803 | [69] https://dot.kde.org/2017/11/30/kdes-goals-2018-and-beyond\r |
| 1804 | \r |
| 1805 | [70] https://phabricator.kde.org/T7116\r |
| 1806 | \r |
| 1807 | [71] https://phabricator.kde.org/T6831\r |
| 1808 | \r |
| 1809 | [72] https://phabricator.kde.org/T7050\r |
| 1810 | \r |
| 1811 | [73] https://akademy.kde.org/\r |
| 1812 | \r |
| 1813 | [74] https://community.kde.org/Promo\r |
| 1814 | \r |
| 1815 | [75] https://www.chakralinux.org/\r |
| 1816 | \r |
| 1817 | [76] https://conf.kde.org/en/Akademy2018/public/events/79\r |
| 1818 | \r |
| 1819 | [77] https://en.wikipedia.org/wiki/Onboarding\r |
| 1820 | \r |
| 1821 | [78] https://community.kde.org/Get_Involved\r |
| 1822 | \r |
| 1823 | [79] https://community.kde.org/KDE/Junior_Jobs\r |
| 1824 | \r |
| 1825 | [80] https://lwn.net/Articles/763189/\r |
| 1826 | \r |
| 1827 | [81] https://phabricator.kde.org/T8686\r |
| 1828 | \r |
| 1829 | [82] https://phabricator.kde.org/T7646\r |
| 1830 | \r |
| 1831 | [83] https://bugs.kde.org/\r |
| 1832 | \r |
| 1833 | [84] https://www.plasma-mobile.org/index.html\r |
| 1834 | \r |
| 1835 | [85] https://www.plasma-mobile.org/findyourway\r |
| 1836 | \r |
| 1837 | [86] https://lwn.net/Articles/763175/#Comments\r |
| 1838 | \r |
| 1839 | [87] https://lwn.net/Articles/763492/\r |
| 1840 | \r |
| 1841 | [88] https://datproject.org\r |
| 1842 | \r |
| 1843 | [89] https://www.bittorrent.com/\r |
| 1844 | \r |
| 1845 | [90] https://github.com/datproject/dat/releases\r |
| 1846 | \r |
| 1847 | [91] https://docs.datproject.org/install\r |
| 1848 | \r |
| 1849 | [92] https://datbase.org/\r |
| 1850 | \r |
| 1851 | [93] https://ed25519.cr.yp.to/\r |
| 1852 | \r |
| 1853 | [94] https://en.wikipedia.org/wiki/Mainline_DHT\r |
| 1854 | \r |
| 1855 | [95] https://github.com/mafintosh/dns-discovery\r |
| 1856 | \r |
| 1857 | [96] https://en.wikipedia.org/wiki/Magnet_URI_scheme\r |
| 1858 | \r |
| 1859 | [97] https://blog.datproject.org/2017/10/13/using-dat-for-auto-\r |
| 1860 | matic-file-backups/\r |
| 1861 | \r |
| 1862 | [98] https://github.com/mafintosh/hypercore-archiver\r |
| 1863 | \r |
| 1864 | [99] https://ipfs.io/\r |
| 1865 | \r |
| 1866 | [100] https://github.com/ipfs/go-ipfs/issues/875\r |
| 1867 | \r |
| 1868 | [101] https://github.com/ipfs/go-ipfs/blob/master/docs/experim-\r |
| 1869 | ental-features.md#ipfs-filestore\r |
| 1870 | \r |
| 1871 | [102] https://hashbase.io/\r |
| 1872 | \r |
| 1873 | [103] https://github.com/datprotocol/DEPs/blob/master/proposal-\r |
| 1874 | s/0003-http-pinning-service-api.md\r |
| 1875 | \r |
| 1876 | [104] https://docs.datproject.org/server\r |
| 1877 | \r |
| 1878 | [105] https://lwn.net/Articles/763544/\r |
| 1879 | \r |
| 1880 | [106] https://beakerbrowser.com/\r |
| 1881 | \r |
| 1882 | [107] https://electronjs.org/\r |
| 1883 | \r |
| 1884 | [108] https://github.com/beakerbrowser/explore\r |
| 1885 | \r |
| 1886 | [109] https://addons.mozilla.org/en-US/firefox/addon/dat-p2p-p-\r |
| 1887 | rotocol/\r |
| 1888 | \r |
| 1889 | [110] https://github.com/sammacbeth/dat-fox\r |
| 1890 | \r |
| 1891 | [111] https://github.com/sammacbeth/dat-fox-helper\r |
| 1892 | \r |
| 1893 | [112] https://github.com/beakerbrowser/dat-photos-app\r |
| 1894 | \r |
| 1895 | [113] https://github.com/datproject/docs/raw/master/papers/dat-\r |
| 1896 | paper.pdf\r |
| 1897 | \r |
| 1898 | [114] https://github.com/datprotocol/DEPs/blob/653e0cf40233b5d-\r |
| 1899 | 474cddc04235577d9d55b2934/proposals/0000-peer-discovery.md#dis-\r |
| 1900 | covery-keys\r |
| 1901 | \r |
| 1902 | [115] https://docs.datproject.org/security\r |
| 1903 | \r |
| 1904 | [116] https://blog.datproject.org/2016/12/12/reader-privacy-on-\r |
| 1905 | the-p2p-web/\r |
| 1906 | \r |
| 1907 | [117] https://blog.datproject.org/2017/12/10/dont-ship/\r |
| 1908 | \r |
| 1909 | [118] https://github.com/datprotocol/DEPs/pull/7\r |
| 1910 | \r |
| 1911 | [119] https://blog.datproject.org/2017/06/01/dat-sleep-release/\r |
| 1912 | \r |
| 1913 | [120] https://github.com/datprotocol/DEPs\r |
| 1914 | \r |
| 1915 | [121] https://github.com/datprotocol/DEPs/blob/master/proposal-\r |
| 1916 | s/0008-multiwriter.md\r |
| 1917 | \r |
| 1918 | [122] https://github.com/mafintosh/hyperdb\r |
| 1919 | \r |
| 1920 | [123] https://codeforscience.org/\r |
| 1921 | \r |
| 1922 | [124] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=890565\r |
| 1923 | \r |
| 1924 | [125] https://github.com/datrs\r |
| 1925 | \r |
| 1926 | [126] https://nodejs.org/en/\r |
| 1927 | \r |
| 1928 | [127] https://bunsenbrowser.github.io/#!index.md\r |
| 1929 | \r |
| 1930 | [128] https://termux.com/\r |
| 1931 | \r |
| 1932 | [129] https://bluelinklabs.com/\r |
| 1933 | \r |
| 1934 | [130] https://www.digital-democracy.org/\r |
| 1935 | \r |
| 1936 | [131] https://archive.org\r |
| 1937 | \r |
| 1938 | [132] https://blog.archive.org/2018/06/05/internet-archive-cod-\r |
| 1939 | e-for-science-and-society-and-california-digital-library-to-pa-\r |
| 1940 | rtner-on-a-data-sharing-and-preservation-pilot-project/\r |
| 1941 | \r |
| 1942 | [133] https://github.com/codeforscience/Dat-in-the-Lab\r |
| 1943 | \r |
| 1944 | [134] https://www.mozilla.org/en-US/moss/\r |
| 1945 | \r |
| 1946 | [135] https://github.com/datprotocol/DEPs/blob/master/proposal-\r |
| 1947 | s/0005-dns.md\r |
| 1948 | \r |
| 1949 | [136] https://lwn.net/Articles/763492/#Comments\r |
| 1950 | \r |
| 1951 | [137] https://lwn.net/Articles/763254/\r |
| 1952 | \r |
| 1953 | [138] https://lwn.net/Articles/763255/\r |
| 1954 | \r |
| 1955 | [139] https://lwn.net/Articles/763254/\r |
| 1956 | \r |
| 1957 | \r |
| 1958 | \r |