• 1 Post
  • 28 Comments
Joined 1 year ago
Cake day: July 9th, 2023



  • If not vanilla Ubuntu, I’d still suggest trying an Ubuntu derivative like Linux Mint or Pop!_OS. Ubuntu has a huge community, so in the event you run into issues it’ll be easier to find fixes for them.

    What you’ll find is that Linux distros are roughly grouped by a “family” (my term for it anyway). Anyone can (theoretically, anyway) start from a given kernel and roll their own distro, but most distros are modified versions of a handful of base distros.

    The major families at the moment are

    • Debian: A classic all-rounder that prioritizes stability over all else. Ubuntu is descended from Debian.

    • Fedora: Another classic all-rounder. I haven’t used it in a decade, so I won’t say much about it here.

    • Arch: If Linux nerds were car people, Arch would be for the hot rodders. You can tune and control pretty much any aspect of your system. … Not a good first distro if you just want to get something going.

    There are many others, but these are the major desktop-PC distro families at the moment.

    The importance of these families is that techniques that work in one (say) Debian-based distro will tend to work in other Debian-based distros… But not necessarily in distros from other families.
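    To make the "family" idea concrete, here's a rough sketch that guesses a distro's family from `/etc/os-release`. Most distros ship that file; `ID` names the distro and `ID_LIKE` lists what it's derived from. The function names and the family-to-package-manager table are my own illustration, not any standard API, and the table only covers the three families above.

```python
# Sketch: guess a distro's family from the contents of /etc/os-release.
# The names below are illustrative, not part of any standard library.

FAMILY_PACKAGE_MANAGERS = {
    "debian": "apt",
    "fedora": "dnf",
    "arch": "pacman",
}


def parse_os_release(text):
    """Parse KEY=value lines from os-release text into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def detect_family(text):
    """Return 'debian', 'fedora', 'arch', or None if the family is not recognized."""
    fields = parse_os_release(text)
    # ID is the distro itself; ID_LIKE lists the distros it is derived from.
    candidates = [fields.get("ID", "")] + fields.get("ID_LIKE", "").split()
    for name in candidates:
        if name in FAMILY_PACKAGE_MANAGERS:
            return name
    return None


if __name__ == "__main__":
    try:
        with open("/etc/os-release") as f:
            family = detect_family(f.read())
    except OSError:
        family = None  # file missing or unreadable (e.g. not a Linux system)
    print("family:", family, "| package manager:", FAMILY_PACKAGE_MANAGERS.get(family))
```

    On Linux Mint, for example, `ID_LIKE` is typically `"ubuntu debian"`, so this lands on the Debian family and `apt`, which is why Debian and Ubuntu how-tos usually carry over.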




  • It’s not as good, but running small LLMs locally can work. I’ve been messing around with Ollama, which makes it drop-dead simple to try out different models locally.

    You won’t be running any model as powerful as ChatGPT - but for quick “Stack Overflow replacement” style questions I find it’s usually good enough.

    And before you write off the idea of local models completely, some recent studies indicate that our current models could be made orders of magnitude smaller for the same level of capability. Think Moore’s law but for shrinking the required connections within a model. I do believe we’ll be able to run GPT-3.5-level models on consumer-grade hardware in the very near future. (Of course, by then GPT-7 may be running the world but we live in hope).
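    If you want to script against Ollama rather than use its command line, here's a minimal sketch that talks to its local HTTP API. It assumes the Ollama server is running on its default port (11434) and that you've already pulled a model. The model name `llama3.2` is just a placeholder I picked, so swap in whatever you have installed.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3.2"):
    """Build (but do not send) a non-streaming generate request.

    The model name is a placeholder; use one you have pulled locally.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="llama3.2"):
    """Send the prompt to the local server and return the reply text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

    With the server up, `print(ask("How do I reverse a list in Python?"))` should print the model's answer. Nothing leaves your machine.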


  • I used to work for an imaging satellite company. And yes - spy satellites are crazy powerful. The real problem is one of bandwidth. Crazy powerful spy satellites are expensive - and there aren’t a lot of them.

    So everybody is competing for time on them. Satellite images have been traditionally expensive and rare. Even intelligence agencies have to take turns and sometimes miss important events due to scheduling or timing conflicts.

    The thing these new satellites offer is broad coverage. When you have a few hundred small-sats there’s just many, many more opportunities to have eyes on the part of the world you’re interested in.

    All that said, you want to pay attention to the resolution of the images. The place I worked for was providing imagery at about 1-meter resolution, i.e. each pixel in the image corresponded to about 1 square meter of earth. We figured this was a good compromise between image quality and privacy: enough to count cars, see weather patterns, and make out groups of people, but identifying any given person was right out.

    So if you see an imaging company throwing a bazillion imaging small-sats up - it’s worth checking what their reported resolution is. 0.5m means a real tall dude would still only be 2 pixels. But 1cm resolution means you could count their teeth.
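    The arithmetic behind that is just object size divided by ground resolution. Here's a quick sketch. The object sizes are my own rough assumptions (a 4.5 m car, a person about 0.5 m across seen from directly overhead, a 1 cm tooth), so treat the output as ballpark only.

```python
def pixels_across(object_size_m, resolution_m):
    """Pixels an object spans along one axis at a given ground resolution (m/pixel)."""
    if resolution_m <= 0:
        raise ValueError("resolution must be positive")
    return object_size_m / resolution_m


# Assumed sizes in meters; adjust to taste.
objects = [("car length", 4.5), ("person from above", 0.5), ("tooth", 0.01)]

for label, size_m in objects:
    for res_m in (1.0, 0.5, 0.01):
        px = pixels_across(size_m, res_m)
        print(f"{label} at {res_m} m/pixel: {px:.2f} px")
```

    Anything that comes out at a pixel or two is a smudge at best. Once an object spans dozens of pixels, you're into identifiable detail.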


  • Fwiw, I set up my pihole at home using docker. I run a full-size desktop as my all-the-things server and use it as a docker host. Makes managing my services much easier.

    I could, of course, use an actual raspi for this, but I run a bunch of other services - including my plex host and file server - on the same machine. Using docker makes it dead easy to update my various services as needed, with no worries about dependency hell between them.


  • pushes glasses up nose Ackchually…

    The recent CPRA regulation in CA has essentially mandated automated data deletion requests. Technically it only applies to CA residents, but it’s so hard to disprove residency that most companies will process requests from anybody.

    It only went into effect last year, but yeah - everybody I’m aware of has implemented an API for processing requests.

    I think $9/mo is pretty fair to cover paying for the engineering and infrastructure to support their ongoing integration efforts.

    That said, you could absolutely build something yourself that sends automated requests to every data broker you can find, but… Mozilla already knows where they are and will be looking for more. It’s going to become a game of whack-a-mole as companies that haven’t received deletion requests will have more complete (and thus more valuable) data sets.

    If you don’t want to just leave it on though - just running this a couple of times a year as a sort of spring-cleaning event should cut down your presence on ad rolls significantly.
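    For the build-it-yourself route, a bare-bones sketch might look like the following. Big assumptions here: the broker list is entirely made up (those `.example` addresses are placeholders), and I'm assuming requests go out by email, whereas real brokers may want a web form or API call instead. It only builds the messages; sending them is left out.

```python
from email.message import EmailMessage

# Hypothetical broker contacts. Real ones would have to be collected by hand.
BROKERS = [
    {"name": "Example Broker A", "privacy_email": "privacy@broker-a.example"},
    {"name": "Example Broker B", "privacy_email": "privacy@broker-b.example"},
]


def build_deletion_request(broker, full_name, sender_email):
    """Build one deletion-request email for one broker (does not send it)."""
    msg = EmailMessage()
    msg["From"] = sender_email
    msg["To"] = broker["privacy_email"]
    msg["Subject"] = "CPRA request to delete personal information"
    msg.set_content(
        f"To {broker['name']},\n\n"
        "Under the CPRA, I request that you delete the personal\n"
        "information you hold about me.\n\n"
        f"Name: {full_name}\n"
        f"Email: {sender_email}\n"
    )
    return msg


def build_all(full_name, sender_email, brokers=BROKERS):
    """Build one request per broker in the list."""
    return [build_deletion_request(b, full_name, sender_email) for b in brokers]


if __name__ == "__main__":
    for message in build_all("Jane Doe", "jane@example.com"):
        print(message["To"], "|", message["Subject"])
```

    The hard part isn't this loop, it's keeping the broker list current, which is exactly the whack-a-mole problem above.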


  • I work in an advertising-adjacent industry. My company doesn’t collect data ourselves, but we do purchase and use advertising data on behalf of our direct customers.

    First off, there’s no single “advertising id” in use across the industry. Some companies make up their own, some companies don’t have one at all. Several companies just link by your email address.

    You may be interested to know that the CPRA legislation in CA from 2023 has made it a legal requirement to allow customers to request that businesses:

    a) disclose what data they have about you

    b) allow you to delete your data

    … and a few other things.

    Technically, this only applies to CA residents, but (dis)proving residency is hard enough that most companies will just accept your request regardless of where you live.

    If you poke around, you should be able to find a way to submit CPRA requests to any given advertising company to request to see your data.

    This comes with a big caveat though - the Stalker Problem. What if some asshole goes to AdSense and says “My name is totally Jane Doe, what do you know about me? Recent addresses, especially.” … That gets into scary waters quick.

    The compromise many places have landed on is to confirm what they know about a person, but not volunteer any extra info. E.g. “I’m Jane Doe - what do you know about me?” -> “We know about Jane Doe.” or “We know nothing about Jane Doe.” (and if you provide email addresses etc, those may be individually confirmed or denied.)
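    In code terms, that compromise boils down to "only answer yes/no about what the requester already supplied". Here's a toy sketch of the idea. The record store and function name are made up for illustration, and this is not how any particular company implements it.

```python
# Hypothetical record store keyed by name. A real system would look very different.
RECORDS = {
    "Jane Doe": {
        "emails": {"jane@example.com"},
        "addresses": {"1 Main St"},
    },
}


def confirm_or_deny(name, claimed_emails=()):
    """Confirm or deny only what the requester supplied; never volunteer stored data."""
    record = RECORDS.get(name)
    if record is None:
        return {"known": False, "emails": {e: False for e in claimed_emails}}
    return {
        "known": True,
        # Each claimed email gets a yes/no; stored addresses are never returned.
        "emails": {e: e in record["emails"] for e in claimed_emails},
    }


if __name__ == "__main__":
    print(confirm_or_deny("Jane Doe", ["jane@example.com", "guess@example.com"]))
```

    The stalker asking about Jane Doe learns only that a record exists and whether the emails they guessed are on file. Her address never appears in the response.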

    There’s a new framework of intermediaries popping up that will automatically submit your info for deletion across the industry, so if you sign up for one of those you can have your data regularly cleared.




  • I don’t disagree that it would be great if it were easier for non-techy people to try Linux. But we’re speaking in the context of an OP who said that they’re not afraid of using the console, which indicates a certain level of technical skill to start from. They’ve asked for directions and I’m trying to provide them.

    Besides that, I think there’s a limit to how simple I - or anyone - can make the process of installing a new OS. That isn’t a “Linux” thing - there isn’t any simpler option if you want to install Windows from scratch either.

    If you want to get Linux in the layman’s hands as easily as most people get Windows, buy them machines from System76. Sorted.

    In the meantime, what would you suggest, vs my “wall of text […] of geeky jargon”?


  • Fair point!


    Making a Bootable USB stick: I like using balenaEtcher to make bootable USBs. It handles downloading, burning and making a bootable image for you. It’s great!

    Just point it at an empty USB stick (or one you’re OK with Etcher erasing everything on) and select the Linux distro (or downloaded ISO) you want to use. Come back later and you’re all set to reboot into Linux from the USB stick!

    Booting From USB: You may need to manually select the USB stick as your boot device when you restart! If so, usually you just need to hold F8 during the reboot process to get the menu.

    If that doesn’t do it, you’ll need to get into your computer’s BIOS to enable booting from the USB. That usually requires holding down either F1, F12, or Delete, depending on your particular BIOS. I usually just hold them all down.

    In the BIOS you’re looking for something like “Boot” or “Boot Order”, “Disk Devices” etc. It may be hidden inside an “Advanced Options” or “Security” section.

    Once you’ve found it, make sure your USB drive is A) enabled for booting and B) in the boot list before your other drives.

    After that, Save and Exit your BIOS (methods vary, but it’s usually written on the screen someplace).


  • Yup, agreed on all counts.

    I just feel that if it’s your first distro, it’s probably better to stick to vanilla Ubuntu until you better understand the subtle differences between the various Debians.

    Still and all though, it’s easier to install a Linux than it’s ever been. My first Linux was actually openSUSE, soon replaced by Debian Etch. I bought the latter online and they mailed me the installation CDs! It took me days to get the installs working.

    Now, you just pop in a USB and follow the friendly install wizard. It’s friggin awesome.


  • Ubuntu is a decent place to start.

    Before anybody decides to jump down my throat over it, there are some very good reasons to not use Ubuntu generally. I know.

    That said, I still recommend it as a first distro because it’s

    • well supported - if someone puts out Linux support, it’s likely been tested on Ubuntu.
    • simple to install - everything from WSL to a live boot USB drive to a full install, you’ve got lots of options
    • pragmatic - yes, it’s compromised vs being truly FOSS. Otoh, your consumer grade Windows-supported hardware will likely work out of the box. For a first timer, I think that’s critical.

    There are many other, better distros out there for specific needs. Manjaro is a great one for gaming in particular, but it can be a little harder to get set up with, or to find help for when things go wrong. But I still think Ubuntu is the best “starter” distro I’ve encountered.






  • I work in an adjacent industry. Establishing “is this person actually a resident of X” is really hard. It’s much easier to just allow everyone to submit CCPA/CPRA requests.

    So that’s what everybody does.

    Just because it’s only required in CA doesn’t mean you won’t be able to make use of it!

    Edit: At the start of 2023 the CPRA already established that CA residents (which really became anybody) can request their data be deleted. It looks like this new bill just mandates a central location to transmit those requests out to everybody from.

    There are already services that do this for a fee (the article mentions DeleteMe). This is going to eat their lunch, but is going to be a major win for privacy!